SEQUENCE LISTING





(1) GENERAL INFORMATION: 

(i) APPLICANT: Kalchman, Michael
Hayden, Michael R.
Hackam, Abigail
Chopra, Vikramjit Singh
Nicholson, Donald W.
Vaillancourt, John P.
Rasper, Dita M.

(ii) TITLE OF INVENTION: Apoptosis Modulators That Interact with the 
Huntington's Disease Gene

(iii) NUMBER OF SEQUENCES: 44

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Oppedahl & Larson

(B) STREET: PO Box 5270

(C) CITY: Frisco 

(D) STATE: CO 

(E) COUNTRY: USA 

(F) ZIP: 80443-5270 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.44 Kb storage

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: MS DOS 5.0

(D) SOFTWARE: WordPerfect 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION 

(A) NAME: Larson, Marina T. 

(B) REGISTRATION NUMBER: 32038 

(C) REFERENCE/DOCKET NUMBER: UBC.P-013US2

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (970) 668-2050 

(B) TELEFAX: (970) 668-2052 

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1164 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human 

(ix) FEATURE: cDNA for Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ACAGCTGACA CCCTGCAAGG CCACCGGGAC CGCTTCATGG AGCAGTTTAC 50
AAAGTTGAAA GATCTGTTCT ACCGCTCCAG CAACCTGCAG TACTTCAAGC 100
GGGTCATTCA GATCCCCCAG CTGCCTGAGA ACCCACCCAA CTTCCTGCGA 150
GCCTCAGCCC TGTCAGAACA TATCAGCCCT GTGGTGGTGA TCCCTGCAGA 200
GGCCTCATCC CCCGACAGCG AGCCAGTCCT AGAGAAGGAT GACCTCATGG 250
ACATGGATGC CTCTCAGCAG AATTTATTTG ACAAGAAGTT TGATGACNTC 300
TTTGGCAGTT CATCCAGCAG TGATCCCTTC AATTTCAACA GTCAAAATGG 350
TGTGAACAAG GATGAGAAGG ACCACTTAAT TGAGCGACTA TACAGAGAGA 400
TCAGTGGATT GAAGGCACAG CTAGAAAACA TGAAGACTGA GAGCCAGCGG 450
GTTGTGCTGC AGCTGAAGGG CCACGTCAGC GAGCTGGAAG CAGATCTGGC 500
CGAGCAGCAG CACCTGCGGC AGCAGGCGGC CGACGACTGT GAATTCCTGC 550
GGGCAGAACT GGACGAGCTC AGGNGGCAGC GGGAGGACAC CGAGAAGGCT 600
CAGCGGAGCC TGTCTGAGAT AGAAAGGAAA GCTCAAGCGA ATGAACAGCG 650
ATATAGCAAG CTAAAGGAGA AGTACAGCGA GCTGGTTCAG AACCACGCTG 700
ACCTGCTGCG GAAGAATGCA GAGGTGACCA AACAGGTGTC CATGGCCAGA 750
CAAGCCCAGG TAGATTTGGA ACGAGAGAAA AAAGAGCTGG AGGATTCGTT 800
GGAGCGCATC AGTGACCAGG GCCAGCGGAA GACTCAAGAA CAGCTGGAAG 850
TTCTAGAGAG CTTGAAGCAG GAACTTGGCA CAAGCCAACG GGAGCTTCAG 900
GTTCTGCAAG GCAGCCTGGA AACTTCTGCC CAGTCAGAAG CAAACTGGGC 950
AGCCGAGTTC GCCGAGCTAG AGAAGGAGCG GGACAGCCTG GTGAGTGGCG 1000
CAGCTCATAG GGAGGAGGAA TTATCTGCTC TTCGGAAAGA ACTGCAGGAC 1050
ACTCAGCTCA AACTGGCCAG CACAGAGGAA TCTATGTGCC AGCTTGCCAA 1100
AGACCAACGA AAAATGCTTC TGGTGGGGTC CAGGAAGGCT GCGGAGCAGG 1150
TGATACAAGA CGCG 1164

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 386 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Thr Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe Met Glu Gln
1 5 10 15

Phe Thr Lys Leu Lys Asp Leu Phe Tyr Arg Ser Ser Asn Leu Gln
20 25 30

Tyr Phe Lys Arg Val Ile Gln Ile Pro Gln Leu Pro Glu Asn Pro
35 40 45

Pro Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His Ile Ser Pro
50 55 60

Val Val Val Ile Pro Ala Glu Ala Ser Ser Pro Asp Ser Glu Pro
65 70 75

Val Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala Ser Gln Gln
80 85 90

Asn Leu Phe Asp Asn Lys Phe Asp Asp Phe Gly Ser Ser Ser Ser
95 100 105

Ser Asp Pro Phe Asn Phe Asn Ser Gln Asn Gly Val Asn Lys Asp
110 115 120

Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg Glu Ile Ser Gly
125 130 135

Leu Lys Ala Gln Leu Glu Asn Met Lys Thr Glu Ser Gln Arg Val
140 145 150

Val Leu Gln Leu Lys Gly His Val Ser Glu Leu Glu Ala Asp Leu
155 160 165

Ala Glu Gln Gln His Leu Arg Gln Gln Ala Ala Asp Asp Cys Glu
170 175 180

Phe Leu Arg Ala Glu Leu Asp Glu Leu Arg Gln Arg Glu Asp Thr
185 190 195

Glu Lys Ala Gln Arg Ser Leu Ser Glu Ile Glu Arg Lys Ala Gln
200 205 210

Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys Glu Lys Tyr Ser Glu
215 220 225

Leu Val Gln Asn His Ala Asp Leu Leu Arg Lys Asn Ala Glu Val
230 235 240

Thr Lys Gln Val Ser Met Ala Arg Gln Ala Gln Val Asp Leu Glu
245 250 255

Arg Glu Lys Lys Glu Leu Glu Asp Ser Leu Glu Arg Ile Ser Asp
260 265 270

Gln Gly Gln Arg Lys Thr Gln Glu Gln Leu Glu Val Leu Glu Ser
275 280 285

Leu Lys Gln Glu Leu Gly Thr Ser Gln Arg Glu Leu Gln Val Leu
290 295 300

Gln Gly Ser Leu Glu Thr Ser Ala Gln Ser Glu Ala Asn Trp Ala
305 310 315

Ala Glu Phe Ala Glu Leu Glu Lys Glu Arg Asp Ser Leu Val Ser
320 325 330

Gly Ala Ala His Arg Glu Glu Glu Leu Ser Ala Leu Arg Lys Glu
335 340 345

Leu Gln Asp Thr Gln Leu Lys Leu Ala Ser Thr Glu Glu Ser Met
350 355 360

Cys Gln Leu Ala Lys Asp Gln Arg Lys Met Leu Leu Val Gly Ser
365 370 375

Arg Lys Ala Ala Glu Gln Val Ile Gln Asp Ala
380 385 386



(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4796 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: cDNA for Huntingtin-interacting protein 
(xi)SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CAGTGTACGG TTGATCATAT AACGCCGCGG GCGGGGATTG GTTTATATAT 50
CGCAAATTGA TNTAGGGGGG GGGGGATGGN CAGAGATTTC GCTTCATTAG 100
GCCATTATAA GCAGGAAGGG TTTCAAGGAA AAAAACCCAG AAAGTGCATA 150
TTGCACCCAC CATGAGAAAG GGGCAACAGA CCTTNTGTTN TGTTNTCAAC 200
CGCCTGCTTC TGTTTTAGCA ACGCAGTGTT TTGGTGGAAG TTGTGCCATG 250
TGTTCCACAA A3KTTCTTCCGA GATGGACACC CGAACGTCCT GAAGGACTTT 300
GTGAGATACA GAAATGAATT GAGTGACATG AGCAGGATGT GGGGCCACCT 350
GAGCGAGGGG TATGGCCAGC TGTGCAGCAT CTACCTGAAA CTGCTAAGAA 400
CCAAGATGGA GTACCACACC AAAAATCCCA GGTTCCCAGG CAACCTGCAG 450
ATGAGTGACC GCCAGCTGGA CGAGGCTGGA GAAAGTGACG TGAACAACTT 500
TTTCCAGTTA ACAGTGGAGA TGTTTGACTA CCTGGAGTGT GAACTCAACC 550
TCTTCCAAAC AGTATTCAAC TCCCTGGACA TGTCCCGCTC TGTGTCCGTG 600
ACGGCAGCAG GGCAGTGCCG CCTCGCCCCG CTGATCCAGG TCATCTTGGA 650
CTGCAGCCAC CTTTATGACT ACACTGTCAA GCTTCTCTTC AAACTCCACT 700
CCTGCCTCCC AGCTGACACC CTGGAAGGCC ACCGGGACCG CTTCATGGAG 750
CAGTTTACAA AGTTGAAAGA TCTGTTCTAC CGCTCCAGCA ACCTGCAGTA 800

CTTCAAGCGG CTCATTCAGA TCCCCCAGCT GCCTGAGAAC CCACCCAACT 850
TCCTGCGAGC CTCAGCCCTG TCAGAACATA TCAGCCCTGT GGTGGTGATC 900
CCTGCAGAGG CCTCATCCCC CGACAGCGAG CCAGTCCTAG AGAAGGATGA 950
CCTCATGGAC ATGGATGCCT CTCAGCAGAA TTTATTTGAC AACAAGTTTG 1000
ATGACATCTT TGGCAGTTCA TTCAGCAGTG ATCCCTTCAA TTTCAACAGT 1050
CAAAATGGTG TGAACAAGGA TGAGAAGGAC CACTTAATTG AGCGACTATA 1100
CAGAGAGATC AGTGGATTGA AGGCACAGCT AGAAAACATG AAGACTGAGA 1150
GCCAGCGGGT TGTGCTGCAG CTGAAGGGCC ACGTCAGCGA GCTGGAAGCA 1200
GATCTGGCCG AGCAGCAGCA CCTGCGGCAG CAGGCGGCCG ACGACTGTGA 1250
ATTCCTGCGG GCAGAACTGG ACGAGCTCAG GAGGCAGCGG GAGGACACCG 1300
AGAAGGCTCA GCGGAGCCTG TCTGAGATAG AAAGGAAAGC TCAAGCCAAT 1350
GAACAGCGAT ATAGCAAGCT AAAGGAGAAG TACAGCGAGC TGGTTCAGAA 1400
CCACGCTGAC CTGCTGCGGA AGAATGCAGA GGTGACCAAA CAGGTGTCCA 1450
TGGCCAGACA AGCCCAGGTA GATTTGGAAC GAGAGAAAAA AGAGCTGGAG 1500
GATTCGTTGG AGCGCATCAG TGACCAGGGC CAGCGGAAGA CTCAAGAACA 1550
GCTGGAAGTT CTAGAGAGCT TGAAGCAGGA ACTTGGCACA AGCCAACGGG 1600
AGCTTCAGGT TCTGCAAGGC AGCCTGGAAA CTTCTGCCCA GTCAGAAGCA 1650
AACTGGGCAG CCGAGTTCGC CGAGCTAGAG AAGGAGCGGG ACAGCCTGGT 1700
GAGTGGCGCA GCTCATAGGG AGGAGGAATT ATCTGCTCTT CGGAAAGAAC 1750
TGCAGGACAC TCAGCTCAAA CTGGCCAGCA CAGAGGAATC TATGTGCCAG 1800
CTTGCCAAAG ACCAACGAAA AATGCTTCTG GTGGGGTCCA GGAAGGCTGC 1850
GGAGCAGGTG ATACAAGACG CCCTGAACCA GCTTGAAGAA CCTCCTCTCA 1900
TCAGCTGCGC TGGGTCTGCA GATCACCTCC TCTCCACGGT CACATCCATT 1950
TCCAGCTGCA TCGAGCAACT GGAGAAAAGC TGGAGCCAGT ATCTGGCCTG 2000
CCCAGAAGAC ATCAGTGGAC TTCTCCATTC CATAACCCTG CTGGCCCACT 2050
TGACCAGCGA CGCCATTGCT CATGGTGCCA CCACCTGCCT CAGAGCCCCA 2100
CCTGAGCCTG CCGACTCACT GACCGAGGCC TGTAAGCAGT ATGGCAGGGA 2150
AACCCTCGCC TACCTGGCCT CCCTGGAGGA AGAGGGAAGC CTTGAGAATG 2200
CCGACAGCAC AGCCATGAGG AACTGCCTGA GCAAGATCAA GGCCATCGGC 2250
GAGGAGCTCC TGCCCAGGGG ACTGGACATC AAGCAGGAGG AGCTGGGGGA 2300
CCTGGTGGAC AAGGAGATGG CGGCCACTTC AGCTGCTATT GAAACTTGCA 2350
CGGCCAGAAT AGAGGAGATG CTCAGCAAAT CCCGAGCAGG AGACACAGGA 2400
GTCAAATTGG AGGTGAATGA AAGGATCCTT CGTTGCTGTA CCAGCCTCAT 2450
GCAAGCTATT CAGGTGCTCA TCGTGGCCTC TAAGGACCTC CAGAGAGAGA 2500
TTGTGGAGAG CGGCAGGGGT ACAGCATCCC CTAAAGAGTT TTATGCCAAG 2550
AACTCTCGAT GGACAGAAGG ACTTATCTCA GCCTCCAAGG CTGTGGGCTG 2600
GGGAGCCACT GTCATGGTGG ATGCAGCTGA TCTGGTGGTA CAAGGCAGAG 2650
GGAAATTTGA GGAGCTAATG GTGTGTTCTC ATGAAATTGC TGCTAGCACA 2700
GCCCAGCTTG TGGCTGCATC CAAGGTGAAA GCTGATAAGG ACAGCCCCAA 2750
CCTAGCCCAG CTGCAGCAGG CCTCTCGGGG AGTGAACCAG GCCACTGCCG 2800
GCGTTGTGGC CTCAACCATT TCCGGCAAAT CACAGATCGA AGAGACAGAC 2850
AACATGGACT TCTCAAGCAT GACGCTGACA CAGATCAAAC GCCAAGAGAT 2900
GGATTCTCAG GTTAGGGTGC TAGAGCTAGA AAATGAATTG CAGAAGGAGC 2950
GTCAAAAACT GGGAGAGCTT CGGAAAAAGC ACTACGAGCT TGCTGGTGTT 3000
GCTGAGGGCT GGGAAGAAGG AACAGAGGCA TCTCCACCTA CACTGCAAGA 3050
AGTGGTAACC GAAAAAGAAT AGAGCCAAAC CAACACCCCA TATGTCAGTG 3100
TAAATCCTTG TTACCTATCT CGTGTGTGTT ATTTCCCCAG CCACAGGCCA 3150
AATCCTTGGA GTCCCAGGGG CAGCCACACC ACTGCCATTA CCCAGTGCCG 3200
AGGACATGCA TGACACTTCC CAAAGATCCC TCCATAGCGA CACCCTTTCT 3250
GTTTGGACCC ATGGTCATCT CTGTTCTTTT CCCGCCTCCC TAGTTAGCAT 3300
CCAGGCTGGC CAGTGCTGCC CATGAGCAAG CCTAGGTACG AAGAGGGGTG 3350

GTGGGGGGCA GGGGCACTCA ACAGAGAGGA CCAACATCCA GTCCTGCTGA 3400
CTATTTGACC CCCACAACAA TGGGTATCCT TAATAGAGGA GCTGCTTGTT 3450
GTTTGTTGAC AGCTTGGAAA GGGAAGATCT TATGCCTTTT CTTTTCTGTT 3500
TTCTTCTCAG TCTTTTCAGT TTCATCATTT GCACAAACTT GTGAGCATCA 3550
GAGGGCTGAT GGATTCCAAA CCAGGACACT ACCCTGAGAT CTGCACAGTC 3600
AGAAGGACGG CAGGAGTGTC CTGGCTGTGA ATGCCAAAGC CATTCTCCCC 3650
CTCTTTGGGC AGTGCCATGG ATTTCCACTG CTTCTTATGG TGGTTGGTTG 3700
GGTTTTTTGG TTTTGTTTTT TTTTTTTAAG TTTCACTCAC ATAGCCAACT 3750
CTCCCAAAGG GCACACCCCT GGGGCTGAGT CTCCAGGGCC CCCCAACTGT 3800
GGTAGCTCCA GCGATGGTGC TGCCCAGGCC TCTCGGTGCT CCATCTCCGC 3850
CTCCACACTG ACCAAGTGCT GGCCCACCCA GTCCATGCTC CAGGGTCAGG 3900
CGGAGCTGCT GAGTGACAGC TTTCCTCAAA AAGCAGAAGG AGAGTGAGTG 3950
CCTTTCCCTC CTAAAGCTGA ATCCCGGCGG AAAGCCTCTG TCCGCCTTTA 4000
CAAGGGAGAA GACAACAGAA AGAGGGACAA GAGGGTTCAC ACAGCCCAGT 4050
TCCCGTGACG AGGCTCAAAA ACTTGATCAC ATGCTTGAAT GGAGCTGGTG 4100
AGATCAACAA CACTACTTCC CTGCCGGAAT GAACTGTCCG TGAATGGTCT 4150
CTGTCAAGCG GGCCGTCTCC CTTGGCCCAG AGACGGAGTG TGGGAGTGAT 4200
TCCCAACTCC TTTCTGCAGA CGTCTGCCTT GGCATCCTCT TGAATAGGAA 4250
GATCGTTCCA CTTTCTACGC AATTGACAAA CCCGGAAGAT CAGATGCAAT 4300
TGCTCCCATC AGGGAAGAAC CCTATACTTG GTTTGCTACC CTTAGTATTT 4350
ATTACTAACC TCCCTTAAGC AGCAACAGCC TACAAAGAGA TGCTTGGAGC 4400
AATCAGAACT TCAGGTGTGA CTCTAGCAAA GCTCATCTTT CTGCCCGGCT 4450
ACATCAGCCT TCAAGAATCA GAAGAAAGCC AAGGTGCTGG ACTGTTACTG 4500
ACTTGGATCC CAAAGCAAGG AGATCATTTG GAGCTCTTGG GTCAGAGAAA 4550
ATGAGAAAGG ACAGAGCCAG CGGCTCCAAC TCCTTTCAGC CACATGCCCC 4600
AGGCTCTCGC TGCCCTGTGG ACAGGATGAG GACAGAGGGC ACATGAACAG 4650
CTTGCCAGGG ATGGGCAGCC CAACAGCACT TTTCCTCTTC TAGATGGACC 4700
CCAGCATTTA AGTGACCTTC TGATCTTGGG AAAACAGCGT CTTCCTTCTT 4750
TATCTATAGC AACTCATTGG TGGTAGCCAT CAAGCACTTC GGAATT 4796



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 924 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ser Arg Met Trp Gly His Leu Ser Glu Gly Tyr Gly Gln Leu
1 5 10 15

Cys Ser Ile Tyr Leu Lys Leu Leu Arg Thr Lys Met Glu Tyr His
20 25 30

Thr Lys Asn Pro Arg Phe Pro Gly Asn Leu Gln Met Ser Asp Arg
35 40 45

Gln Leu Asp Glu Ala Gly Glu Ser Asp Val Asn Asn Phe Phe Gln
50 55 60

Leu Thr Val Glu Met Phe Asp Tyr Leu Glu Cys Glu Leu Asn Leu
65 70 75

Phe Gln Thr Val Phe Asn Ser Leu Asp Met Ser Arg Ser Val Ser
80 85 90

Val Thr Ala Ala Gly Gln Cys Arg Leu Ala Pro Leu Ile Gln Val
95 100 105

Ile Leu Asp Cys Ser His Leu Tyr Asp Tyr Thr Val Lys Leu Leu
110 115 120

Phe Lys Leu His Ser Cys Leu Pro Ala Asp Thr Leu Gln Gly His
125 130 135

Arg Asp Arg Phe Met Glu Gln Phe Thr Lys Leu Lys Asp Leu Phe
140 145 150

Tyr Arg Ser Ser Asn Leu Gln Tyr Phe Lys Arg Leu Ile Gln Ile
155 160 165

Pro Gln Leu Pro Glu Asn Pro Pro Asn Phe Leu Arg Ala Ser Ala
170 175 180

Leu Ser Glu His Ile Ser Pro Val Val Val Ile Pro Ala Glu Ala
185 190 195

Ser Ser Pro Asp Ser Glu Pro Val Leu Glu Lys Asp Asp Leu Met
200 205 210

Asp Met Asp Ala Ser Gln Gln Asn Leu Phe Asp Asn Lys Phe Asp
215 220 225

Asp Ile Phe Gly Ser Ser Phe Ser Ser Asp Pro Phe Asn Phe Asn
230 235 240

Ser Gln Asn Gly Val Asn Lys Asp Glu Lys Asp His Leu Ile Glu
245 250 255

Arg Leu Tyr Arg Glu Ile Ser Gly Leu Lys Ala Gln Leu Glu Asn
260 265 270

Met Lys Thr Glu Ser Gln Arg Val Val Leu Gln Leu Lys Gly His
275 280 285

Val Ser Glu Leu Glu Ala Asp Leu Ala Glu Gln Gln His Leu Arg
290 295 300

Gln Gln Ala Ala Asp Asp Cys Glu Phe Leu Arg Ala Glu Leu Asp
305 310 315

Glu Leu Arg Arg Gln Arg Glu Asp Thr Glu Lys Ala Gln Arg Ser
320 325 330

Leu Ser Glu Ile Glu Arg Lys Ala Gln Ala Asn Glu Gln Arg Tyr
335 340 345

Ser Lys Leu Lys Glu Lys Tyr Ser Glu Leu Val Gln Asn His Ala
350 355 360

Asp Leu Leu Arg Lys Asn Ala Glu Val Thr Lys Gln Val Ser Met
365 370 375

Ala Arg Gln Ala Gln Val Asp Leu Glu Arg Glu Lys Lys Glu Leu
380 385 390

Glu Asp Ser Leu Glu Arg Ile Ser Asp Gln Gly Gln Arg Lys Thr
395 400 405

Gln Glu Gln Leu Glu Val Leu Glu Ser Leu Lys Gln Glu Leu Gly
410 415 420

Thr Ser Gln Arg Glu Leu Gln Val Leu Gln Gly Ser Leu Glu Thr
425 430 435

Ser Ala Gln Ser Glu Ala Asn Trp Ala Ala Glu Phe Ala Glu Leu
440 445 450

Glu Lys Glu Arg Asp Ser Leu Val Ser Gly Ala Ala His Arg Glu
455 460 465

Glu Glu Leu Ser Ala Leu Arg Lys Glu Leu Gln Asp Thr Gln Leu
470 475 480

Lys Leu Ala Ser Thr Glu Glu Ser Met Cys Gln Leu Ala Lys Asp
485 490 495

Gln Arg Lys Met Leu Leu Val Gly Ser Arg Lys Ala Ala Glu Gln
500 505 510

Val Ile Gln Asp Ala Leu Asn Gln Leu Glu Glu Pro Pro Leu Ile
515 520 525

Ser Cys Ala Gly Ser Ala Asp His Leu Leu Ser Thr Val Thr Ser
530 535 540

Ile Ser Ser Cys Ile Glu Gln Leu Glu Lys Ser Trp Ser Gln Tyr
545 550 555

Leu Ala Cys Pro Glu Asp Ile Ser Gly Leu Leu His Ser Ile Thr
560 565 570

Leu Leu Ala His Leu Thr Ser Asp Ala Ile Ala His Gly Ala Thr
575 580 585

Thr Cys Leu Arg Ala Pro Pro Glu Pro Ala Asp Ser Leu Thr Glu
590 595 600

Ala Cys Lys Gln Tyr Gly Arg Glu Thr Leu Ala Tyr Leu Ala Ser
605 610 615

Leu Glu Glu Glu Gly Ser Leu Glu Asn Ala Asp Ser Thr Ala Met
620 625 630

Arg Asn Cys Leu Ser Lys Ile Lys Ala Ile Gly Glu Glu Leu Leu
635 640 645

Pro Arg Gly Leu Asp Ile Lys Gln Glu Glu Leu Gly Asp Leu Val
650 655 660

Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu Thr Cys Thr
665 670 675

Ala Arg Ile Glu Glu Met Leu Ser Lys Ser Arg Ala Gly Asp Thr
680 685 690

Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Arg Cys Cys Thr
695 700 705

Ser Leu Met Gln Ala Ile Gln Val Leu Ile Val Ala Ser Lys Asp
710 715 720

Leu Gln Arg Glu Ile Val Glu Ser Gly Arg Gly Thr Ala Ser Pro
725 730 735

Lys Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu Ile
740 745 750

Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Val Met Val Asp
765 770 775

Ala Ala Asp Leu Val Val Gln Gly Arg Gly Lys Phe Glu Glu Leu
780 785 790

Met Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala Gln Leu Val
795 800 805

Ala Ala Ser Lys Val Lys Ala Asp Lys Asp Ser Pro Asn Leu Ala
810 815 820



Gln Leu Gln Gln Ala Ser Arg Gly Val Asn Gln Ala Thr Ala Gly
825 830 835

Val Val Ala Ser Thr Ile Ser Gly Lys Ser Gln Ile Glu Glu Thr
840 845 850

Asp Asn Met Asp Phe Ser Ser Met Thr Leu Thr Gln Ile Lys Arg
855 860 865

Gln Glu Met Asp Ser Gln Val Arg Val Leu Glu Leu Glu Asn Glu
870 875 880

Leu Gln Lys Glu Arg Gln Lys Leu Gly Glu Leu Arg Lys Lys His
885 890 895

Tyr Glu Leu Ala Gly Val Ala Glu Gly Trp Glu Glu Gly Thr Glu
900 905 910

Ala Ser Pro Pro Thr Leu Gln Glu Val Val Thr Glu Lys Glu
915 920 924



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Leu Leu Cys Gln Gly Ser Glu Trp Arg Arg Asp Gln Gln Leu
5 10 15

Gly Thr Ala Asn Ala Arg Gln Trp Cys Pro Leu Pro Gln Asp Ala
20 25 30

Gln Pro Ala Gly Ser Trp Glu Arg Cys Pro Pro Leu Pro Pro Ala
35 40 45

Gly Arg Leu Gln Gly Thr Asp His Pro Trp Gly Trp Gly Arg Leu
50 55 60

Ala Gly Gly Gly Glu Arg Gly Gly Leu Trp Glu Gly Leu Ser His
65 70 75

Ser Gln Arg Leu Ile His Leu Ile Leu Leu Ser Leu Pro Leu Leu
80 85 90

Val Phe Gln Thr Val Ser Ile Asn Lys Ala Ile Asn Thr Gln Glu
95 100 105

Val Ala Val Lys Glu Lys His Ala Arg Thr Cys Ile Leu Gly Thr
110 115 120

His His Glu Lys Gly Ala Gln Thr Phe Trp Ser Val Val Asn Arg
125 130 135

Leu Pro Leu Ser Ser Asn Ala Val Leu Cys Trp Lys Phe Cys His
140 145 150

Val Phe His Lys Leu Leu Arg Asp Gly His Pro Asn Val Leu Lys
155 160 165

Asp Ser Leu Arg Tyr Arg Asn Glu Leu Ser Asp Met Ser Arg Met
170 175 180

Trp Gly His Leu Ser Glu Gly Tyr Gly Gln Leu Cys Ser Ile Tyr
185 190 195

Leu Lys Leu Leu Arg Thr Lys Met Glu Tyr His Thr Lys Asn Pro
200 205 210

Arg Phe Pro Gly Asn Leu Gln Met Ser Asp Arg Gln Leu Asp Glu
215 220 225

Ala Gly Glu Ser Asp Val Asn Asn Phe Phe Gln Leu Thr Val Glu
230 235 240

Met Phe Asp Tyr Leu Glu Cys Glu Leu Asn Leu Phe Gln Thr Val
245 250 255

Phe Asn Ser Leu Asp Met Ser Arg Ser Val Ser Val Thr Ala Ala
260 265 270

Gly Gln Cys Arg Leu Ala Pro Leu Ile Gln Val Ile Leu Asp Cys
275 280 285

Ser His Leu Tyr Asp Tyr Thr Val Lys Leu Leu Phe Lys Leu His
290 295 300

Ser Cys Leu Pro Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe
305 310 315

Met Glu Gln Phe Thr Lys Leu Lys Asp Leu Phe Tyr Arg Ser Ser
320 325 330

Asn Leu Gln Tyr Phe Lys Arg Leu Ile Gln Ile Pro Gln Leu Pro
335 340 345

Glu Asn Pro Pro Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His
350 355 360

Ile Ser Pro Val Val Val Ile Pro Ala Glu Ala Ser Ser Pro Asp
365 370 375

Ser Glu Pro Val Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala
380 385 390

Ser Gln Gln Asn Leu Phe Asp Asn Lys Phe Asp Asp Ile Phe Gly
395 400 405

Ser Ser Phe Ser Ser Asp Pro Phe Asn Phe Asn Ser Gln Asn Gly
410 415 420

Val Asn Lys Asp Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg
425 430 435

Glu Ile Ser Gly Leu Lys Ala Gln Leu Glu Asn Met Lys Thr Glu
440 445 450

Ser Gln Arg Val Val Leu Gln Leu Lys Gly His Val Ser Glu Leu
455 460 465

Glu Ala Asp Leu Ala Glu Gln Gln His Leu Arg Gln Gln Ala Ala
470 475 480

Asp Asp Cys Glu Phe Leu Arg Ala Glu Leu Asp Glu Leu Arg Arg
485 490 495

Gln Arg Glu Asp Thr Glu Lys Ala Gln Arg Ser Leu Ser Glu Ile
500 505 510

Glu Arg Lys Ala Gln Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys
515 520 525

Glu Lys Tyr Ser Glu Leu Val Gln Asn His Ala Asp Leu Leu Arg
530 535 540

Lys Asn Ala Glu Val Thr Lys Gln Val Ser Met Ala Arg Gln Ala
545 550 555

Gln Val Asp Leu Glu Arg Glu Lys Lys Glu Leu Glu Asp Ser Leu
560 565 570

Glu Arg Ile Ser Asp Gln Gly Gln Arg Lys Thr Gln Glu Gln Leu
575 580 585

Glu Val Leu Glu Ser Leu Lys Gln Glu Leu Ala Thr Ser Gln Arg
590 595 600

Glu Leu Gln Val Leu Gln Gly Ser Leu Glu Thr Ser Ala Gln Ser
605 610 615

Glu Ala Asn Trp Ala Ala Glu Phe Ala Glu Leu Glu Lys Glu Arg
620 625 630

Asp Ser Leu Val Ser Gly Ala Ala His Arg Glu Glu Glu Leu Ser
635 640 645

Ala Leu Arg Lys Glu Leu Gln Asp Thr Gln Leu Lys Leu Ala Ser
650 655 660

Thr Glu Glu Ser Met Cys Gln Leu Ala Lys Asp Gln Arg Lys Met
665 670 675

Leu Leu Val Gly Ser Arg Lys Ala Ala Glu Gln Val Ile Gln Asp
680 685 690

Ala Leu Asn Gln Leu Glu Glu Pro Pro Leu Ile Ser Cys Ala Gly
695 700 705

Ser Ala Asp His Leu Leu Ser Thr Val Thr Ser Ile Ser Ser Cys
710 715 720

Ile Glu Gln Leu Glu Lys Ser Trp Ser Gln Tyr Leu Ala Cys Pro
725 730 735

Glu Asp Ile Ser Gly Leu Leu His Ser Ile Thr Leu Leu Ala His
740 745 750

Leu Thr Ser Asp Ala Ile Ala His Gly Ala Thr Thr Cys Leu Arg
755 760 765

Ala Pro Pro Glu Pro Ala Asp Ser Leu Thr Glu Ala Cys Lys Gln
770 775 780

Tyr Gly Arg Glu Thr Leu Ala Tyr Leu Ala Ser Leu Glu Glu Glu
785 790 795

Gly Ser Leu Glu Asn Ala Asp Ser Thr Ala Met Arg Asn Cys Leu
800 805 810

Ser Lys Ile Lys Ala Ile Gly Glu Glu Leu Leu Pro Arg Gly Leu
815 820 825

Asp Ile Lys Gln Glu Glu Leu Gly Asp Leu Val Asp Lys Glu Met
830 835 840

Ala Ala Thr Ser Ala Ala Ile Glu Thr Ala Thr Ala Arg Ile Glu
845 850 855

Glu Met Leu Ser Lys Ser Arg Ala Gly Asp Thr Gly Val Lys Leu
860 865 870

Glu Val Asn Glu Arg Ile Leu Gly Cys Cys Thr Ser Leu Met Gln
875 880 885

Ala Ile Gln Val Leu Ile Val Ala Ser Lys Asp Leu Gln Arg Glu
890 895 900

Ile Val Glu Ser Gly Arg Gly Thr Ala Ser Pro Lys Glu Phe Tyr
905 910 915

Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu Ile Ser Ala Ser Lys
920 925 930

Ala Val Gly Trp Gly Ala Thr Val Met Val Asp Ala Ala Asp Leu
935 940 945

Val Val Gln Gly Arg Gly Lys Phe Glu Glu Leu Met Val Cys Ser
950 955 960

His Glu Ile Ala Ala Ser Thr Ala Gln Leu Val Ala Ala Ser Lys
965 970 975

Val Lys Ala Asp Lys Asp Ser Pro Asn Leu Ala Gln Leu Gln Gln
980 985 990

Ala Ser Arg Gly Val Asn Gln Ala Thr Ala Gly Val Val Ala Ser
995 1000 1005

Thr Ile Ser Gly Lys Ser Gln Ile Glu Glu Thr Asp Asn Met Asp
1010 1015 1020

Phe Ser Ser Met Thr Leu Thr Gln Ile Lys Arg Gln Glu Met Asp
1025 1030 1035

Ser Gln Val Arg Val Leu Glu Leu Glu Asn Glu Leu Gln Lys Glu
1040 1045 1050

Arg Gln Lys Leu Gly Glu Leu Arg Lys Lys His Tyr Glu Leu Ala
1055 1060 1065

Gly Val Ala Glu Gly Trp Glu Glu Gly Thr Glu Ala Ser Pro Pro
1070 1075 1080

Thr Leu Gln Glu Val Val Thr Glu Lys Glu
1085 1090



(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3301 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: cDNA for Huntingtin-interacting protein 
(xi)SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



CGGTGAGCTG GAGGAGCAGC GGAAGCAGAA GCAGAAGGCC CTGGTGGATA 50
ATGAGCAGCT CCGCCACGAG CTGGCCCAGC TGAGGGCTGC CCAGCTGGAG 100
CGCGAGCGGA GCCAGGGCCT GCGTGAGGAG GCTGAGAGGA AGGCCAGTGC 150
CACGGAGGCG CGCTACAACA AGCTGAAGGA AAAGCACAGT GAGCTCGTCC 200
ATGTGCACGC GGAGCTGCTC AGAAAGAACG CGGACACAGC CAAGCAGCTG 250
ACGGTGACGC AGCAAAGCCA GGAGGAGGTG GCGCGGGTGA AGGAGCAGCT 300
GGCCTTCCAG GTGGAGCAGG TGAAGCGGGA GTCGGAGTTG AAGCTAGAGG 350
AGAAGAGCGA CCAGCAGGAG AAGCTCAAGA GGGAGCTGGA GGCCAAGGCC 400
GGAGAGCTGG CCCGCGCGCA GGAGGCCCTG AGCCACACAG AGCAGAGCAA 450
GTCGGAGCTG AGCTCACGGC TGGACACACT GAGTGCGGAG AAGGATGCTC 500
TGAGTGGAGC TGTGCGGGAG CGGGAGGCAG ACCTGCTGGC GGCGCAGAGC 550
CTGGTGCGCG AGACAGAGGC GGCGCTGAGC CGGGAGCAGC AGCGCAGCTC 600
CCAGGAGCAG GGCGAGTTGC AGGGCCGGCT GGCAGAGAGG GAGTCTCAGG 650
AGCAGGGGCT GCGGCAGAGG CTGCTGGACG AGCAGTTCGC AGTGTTGCGG 700
GGCGCTGCTG CCGAGGCCGC GGGCATCCTG CAGGATGCCG TGAGCAAGCT 750
GGACGACCCC CTGCACCTGC GCTGTACCAG CTCCCCAGAC TACCTGGTGA 800
GCAGGGCCCA GGAGGCCTTG GATGCCGTGA GCACCCTGGA GGAGGGCCAC 850
GCCCAGTACC TGACCTCCTT GGCAGACGCC TCCGCCCTGG TGGCAGCTCT 900
GACCCGCTTC TCCCACCTGG CTGCGGATAC CATCATCAAT GGCGGTGCCA 950
CCTCGCACCT GGCTCCCACC GACCCTGCCG ACCGCCTCAT AGACACCTGC 1000
AGGGAGTGCG GGGCCCGGGC TCTGGAGCTC ATGGGGCAGC TGCAGGACCA 1050
GCAGGCTCTG CGGCACATGC AGGCCAGCCT GGTGCGGACA CCCCTGCAGG 1100
GCATCGTTCA GCTGGGCCAA GAACTGAAAC CCAAGAGCCT AGATGTGCGG 1150
CAGGAGGAGC TGGGGGCCGT GGTCGACAAG GAGATGGCGG CCACATCCGC 1200
AGCCATTGAA GATGCTGTGC GGAGGATTGA GGACATGATG AACCAGGCAC 1250
GCCACGCCAG CTCGGGGGTG AAGCTGGAGG TGAACGAGAG GATCCTCAAC 1300
TCCTGCACAG ACCTGATGAA GGCTATCCGG CTCCTGGTGA CGACATCCAC 1350
TAGCCTGCAG AAGGAGATCG TGGAGAGCGG CAGGGGGGCA GCCACGCAGC 1400
AGGAATTTTA CGGCAAGAAC TCGCGCTGGA CCGAAGGCCT CATCTCGGCC 1450
TCCAAGGCTG TGGGCTGGGG AGCCACACAG CTGGTGGAGG CAGCTGACAA 1500
GGTGGTGCTT CACACGGGCA AGTATGAGGA GCTCATCGTC TGCTCCCACG 1550
AGATCGCAGC CAGCACGGCC CAGCTGGTGG CGGCCTCCAA GGTGAAGGCC 1600
AACAAGCACA GCCCCCACCT GAGCCGCCTG CAGGAATGTT CTCGCACAGT 1650

CAATGAGAGG GCTGCCAATG TGGTGGCCTC CACCAAGTCA GGCCAGGAGC 1700
AGATTGAGGA CAGAGACACC ATGGATTTCT CCGGCCTGTC CCTCATCAAG 1750
CTGAAGAAGC AGGAGATGGA GACGCAGGTG CGTGTCCTGG AGCTGGAGAA 1800
GACGCTGGAG GCTGAACGCA TGCGGCTGGG GGAGTTGCGG AAGCAACACT 1850
ACGTGCTGGC TGGGGCATCA GGCAGCCCTG GAGAGGAGGT GGCCATCCGG 1900
CCCAGCACTG CCCCCCGAAG TGTAACCACC AAGAAACCAC CCCTGGCCCA 1950
GAAGCCCAGC GTGGCCCCCA GACAGGACCA CCAGCTTGAC AAAAAGGATG 2000
GCATCTACCC AGCTCAACTC GTGAACTACT AGGCCCCCCA GGGGTCCAGC 2050
AGGGTGGCTG GTGACAGGCC TGGGCCTCTG CAACTGCCCT GACAGGACCG 2100
AGAGGCCTTG CCCCTCCACC TGGTGCCCAA GCCTCCCGCC CCACCGTCTG 2150
GATCAATGTC CTCAAGGCCC CTGGCCCTTA CTGAGCCTGC AGGGTCCTGG 2200
GCCATGTGGG TGGTGCTTCT GGATGTGAGT CTCTTATTTA TCTGCAGAAG 2250
GAACTTTGGG GTGCAGCCAG GACCCGGTAG GCCTGAGCCT CAACTCTTCA 2300
GAAAATAGTG TTTTTAATAT TCCTCTTCAG AAAATAGTGT TTTTAATATT 2350
CCGAGCTAGA GCTCTTCTTC CTACGTTTGT AGTCAGCACA CTGGGAAACC 2400
GGGCCAGCGT GGGGCTCCCT GCCTTCTGGA CTCCTGAAGG TCGTGGATGG 2450
ATGGAAGGCA CACAGCCCGT GCCGGCTGAT GGGACGAGGG TCAGGCATCC 2500
TGTCTGTGGC CTTCTGGGGC ACCGATTCTA CCAGGCCCTC CAGCTGCGTG 2550
GTCTCCGCAG ACCAGGCTCT GTGTGGGCTA GAGGAATGTC GCCCATTACC 2600
TCCTCAGGCC CTGGCCCTCG GGCCTCCGTG ATGGGAGCCC CCCAGGAGGG 2700
GTCAGATGCT GGAAGGGGCC GCTTTCTGGG GAGTGAGGTG AGACATAGCG 2750
GCCCAGGCGC TGCCTTCACT CCTGGAGTTT CCATTTCCAG CTGGAATCTG 2800
CAGCCACCCC CATTTCCTGT TTTCCATTCC CCCGTTCTGG CCGCGCCCCA 2850
CTGCCCACCT GAAGGGGTGG TTTCCAGCCC TCCGGAGAGT GGGCTTGGCC 2900
CTAGGCCCTC CAGCTCAGCC AGAAAAAGCC CAGAAACCCA GGTGCTGGAC 2950
CAGGGCCCTC AGGGAGGGAC CCTGCGGCTA GAGTGGGCTA GGCCCTGGCT 3000
TTGCCCGTCA GATTTGAACG AATGTGTGTC CCTTGAGCCC AAGGAGAGCG 3050
GCAGGAGGGG TGGGACCAGG CTGGGAGGAC AGAGCCAGCA GCTGCCATGC 3100
CCTCCTGCTC CCCCCACCCC AGCCCTAGCC CTTTAGCCTT TCACCCTGTG 3150
CTCTGGAAAG GCTACCAAAT ACTGGCCAAG GTCAGGAGGA GCAAAAATGA 3200
GCCAGCACCA GCGCCTTGGC TTTGTGTTAG CATTTCCTCC TGAAGTGTTC 3250
TGTTGGCAAT AAAATGCACT TTGACTGTTA AAAAAAAAAA AAAAAAAAAA 3300
A 3301



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Gly Glu Leu Glu Glu Gln Arg Lys Gln Lys Gln Lys Ala Leu Val
1 5 10 15

Asp Asn Glu Gln Leu Arg His Glu Leu Ala Gln Leu Arg Ala Ala
20 25 30

Gln Leu Glu Arg Glu Arg Ser Gln Gly Leu Arg Glu Glu Ala Glu
35 40 45

Arg Lys Ala Ser Ala Thr Glu Ala Arg Tyr Asn Lys Leu Lys Glu
50 55 60

Lys His Ser Glu Leu Val His Val His Ala Glu Leu Leu Arg Lys
65 70 75

Asn Ala Asp Thr Ala Lys Gln Leu Thr Val Thr Gln Gln Ser Gln
80 85 90

Glu Glu Val Ala Arg Val Lys Glu Gln Leu Ala Phe Gln Val Glu
95 100 105

Gln Val Lys Arg Glu Ser Glu Leu Lys Leu Glu Glu Lys Ser Asp
110 115 120

Gln Gln Glu Lys Leu Lys Arg Glu Leu Glu Ala Lys Ala Gly Glu
125 130 135

Leu Ala Arg Ala Gln Glu Ala Leu Ser His Thr Glu Gln Ser Lys
140 145 150

Ser Glu Leu Ser Ser Arg Leu Asp Thr Leu Ser Ala Glu Lys Asp
155 160 165

Ala Leu Ser Gly Ala Val Arg Gln Arg Glu Ala Asp Leu Leu Ala
170 175 180

Ala Gln Ser Leu Val Arg Glu Thr Glu Ala Ala Leu Ser Arg Glu
185 190 195

Gln Gln Arg Ser Ser Gln Glu Gln Gly Glu Leu Gln Gly Arg Leu
200 205 210

Ala Glu Arg Glu Ser Gln Glu Gln Gly Leu Arg Gln Arg Leu Leu
215 220 225

Asp Glu Gln Phe Ala Val Leu Arg Gly Ala Ala Ala Glu Ala Ala
230 235 240

Gly Ile Leu Gln Asp Ala Val Ser Lys Leu Asp Asp Pro Leu His
245 250 255

Leu Arg Cys Thr Ser Ser Pro Asp Tyr Leu Val Ser Arg Ala Gln
260 265 270

Glu Ala Leu Asp Ala Val Ser Thr Leu Glu Glu Gly His Ala Gln
275 280 285

Tyr Leu Thr Ser Leu Ala Asp Ala Ser Ala Leu Val Ala Ala Leu
290 295 300

Thr Arg Phe Ser His Leu Ala Ala Asp Thr Ile Ile Asn Gly Gly
305 310 315

Ala Thr Ser His Leu Ala Pro Thr Asp Pro Ala Asp Arg Leu Ile
320 325 330

Asp Thr Cys Arg Glu Cys Gly Ala Arg Ala Leu Glu Leu Met Gly
335 340 345

Gln Leu Gln Asp Gln Gln Ala Leu Arg His Met Gln Ala Ser Leu
350 355 360

Val Arg Thr Pro Leu Gln Gly Ile Leu Gln Leu Gly Gln Glu Leu
365 370 375

Lys Pro Lys Ser Leu Asp Val Arg Gln Glu Glu Leu Gly Ala Val
380 385 390

Val Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu Asp Ala
395 400 405

Val Arg Arg Ile Glu Asp Met Met Asn Gln Ala Arg His Ala Ser
410 415 420

Ser Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Asn Ser Cys
425 430 435

Thr Asp Leu Met Lys Ala Ile Arg Leu Leu Val Thr Thr Ser Thr
440 445 450

Ser Leu Gln Lys Glu Ile Val Glu Ser Gly Arg Gly Ala Ala Thr
455 460 465

Gln Gln Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu
470 475 480

Ile Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Gln Leu Val
485 490 495

Glu Ala Ala Asp Lys Val Val Leu His Thr Gly Lys Tyr Glu Glu
500 505 510

Leu Ile Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala Gln Leu
515 520 525

Val Ala Ala Ser Lys Val Lys Ala Asn Lys His Ser Pro His Leu
530 535 540

Ser Arg Leu Gln Glu Cys Ser Arg Thr Val Asn Glu Arg Ala Ala
545 550 555

Asn Val Val Ala Ser Thr Lys Ser Gly Gln Glu Gln Ile Glu Asp
560 565 570

Arg Asp Thr Met Asp Phe Ser Gly Leu Ser Leu Ile Lys Leu Lys
575 580 585

Lys Gln Glu Met Glu Thr Gln Val Arg Val Leu Glu Leu Glu Lys
590 595 600

Thr Leu Glu Ala Glu Arg Met Arg Leu Gly Glu Leu Arg Lys Gln
605 610 615

His Tyr Val Leu Ala Gly Ala Ser Gly Ser Pro Gly Glu Glu Val
620 625 630

Ala Ile Arg Pro Ser Thr Ala Pro Arg Ser Val Thr Thr Lys Lys
635 640 645

Pro Pro Leu Ala Gln Lys Pro Ser Val Ala Pro Arg Gln Asp His
650 655 660

Gln Leu Asp Lys Lys Asp Gly Ile Tyr Pro Ala Gln Leu Val Asn
665 670 675

Tyr



(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2338 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 

(ix) FEATURE: cDNA for Huntingtin-interacting protein - mHIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGCACGAGGG CTCATTCAGA TCCCCCAGCT GCCCGAGAAT CCACCCAACTT 50
CCTACGAGCC TCGGCCCTGT CAGAGCACAT CAGTCCTGTG GTGGTGATCCC 100
GGCAGAGGTG TCATCCCCAG ACAGTGAGCC TGTCCTGGAG AAGGATGACCT 150
CATGGACATG GACGCCTCCC AGCAGACTTT GTTTGACAAC AAGTTTGATGA 200
CGTCTTTGGC AGCTCATTGA GCAGCGACCC TTTCAATTTC AACAATCAAAA 250
TGGCGTGAAC AAGGACGAGA AGGACCACTT GATTGAACGC CTGTACAGAGA 300
GATCAGTGGA CTGACAGGGC AGCTGGACAA CATGAAGATT GAGAGCCAGCG 350
GGCCATGCTG CAGCTGAAGG GTCGAGTGAG TGAGCTGGAG GCAGAGCTAGC 400
AGAGCAGCAG CACTTGGGCC GGCAGGCTAT GGATGACTGC GAGTTCCTGCG 450
CACTGAGCTG GATGAACTGA AGAGGCAGCG AGAGGACACG GAGAAGGCACA 500
GCGCAGCCTG ACTGAGATAG AAAGAAAGGC CCAGGCTAAT GAACAGAGGTA 550
TAGCAAGTTA AAAGAGAAGT ACAGTGAACT GGTGCAGAAC CATGCTGACCT 600
GCTGCGGAAG AACGCAGAGG TGACCAAACA GGTGTCCGTG GCCCGGCAAGC 650
CCAGGTGGAT TTGGAAAGAG AGAAAAAAGA GCTAGCAGAT TCCTTTGCAC 700
GTGTAAGTGA CCAGGCCCAG CGGAAGACTC AAGAGCAACA GGATGTTCTA 750
GAGAACCTGA AGCATGAACT GGCCACCAGC AGACAGGAGC TGCAGGTCCT 800
CCACAGCAAC CTGGAAACCT CTGCCCAGTC AGAAGCGAAA TGGCTGACAC 850
AGATCGCCGA GTTGGAGAAG GAACAAGGCA GCTTGGCGAC TGTTGCAGCT 900
CAGAGAGAGG AAGAGTTATC AGCCCTCCGA GACCAGCTGG AAAGCACCCA 950
GATCAAGCTG GCTGGGGCCC AGGAATCCAT GTGCCAGCAG GTGAAGGACC 1000
AGAGGAAAAC CCTCTTGGCA GGGATCAGGA AGGCTGCGGA GCGTGAGATA 1050
CAGGAGGCGC TGAGCCAGCT TGAGGAACCC ACCCTCATCA GCTGTGCAGG 1100
ATCCACAGAT CACCTTCTCT CCAAAGTCAG CTCCGTTTCC AGCTGCCTCG 1150
AGCAACTGGA AAAGAACGGC AGCCAGTATC TGGCCTGCCC AGAAGATATT 1200
AGTGAGCTTC TGCACTCGAT CACCCTGCTT GCCCACTTGA CCGGTGACAC 1250
TGTCATCCAG GGGAGTGCCA CCAGCCTCCG GGCCCCACCG GAGCCAGCCG 1300
ACTCGTTGAC GGAGGCCTGT AGGCAGTATG GCAGAGAAAC CCTGGCCTAT 1350
CTGTCCTCCC TGGAGGAAGA GGGAACTGTG GAGAATGCTG ACGTCACAGC 1400
CCTTAGGAAT TGCCTCAGCA GGGTCAAGAC CCTTGGCGAG GAGCTGCTGC 1450
CCAGGGGCCT GGACATCAAG CAGGAAGAGC TGGGTGACCT GGTGGACAAG 1500
GAGATGGCAG CCACTTCAGC TGCCATTGAA GCTGCCACCA CCCGGATAGA 1550
GGAAATTCTC AGTAAGTCCC GAGCAGGAGA CACGGGAGTC AAGCTGGAGG 1600
TGAATGAGAG GATCCTGGGT TCCTGTACCA GCCTGATGCA GGCCATCAAG 1650
GTGCTCGTTG TGGCCTCCAA GGACCTCCAG AAGGAGATAG TGGAGAGTGG 1700
CAGGGGTAGT GCATCCCCTA AAGAATTTTA CGCCAAGAAC TCTCGGTGGA 1750
CGGAAGGGCT GATATCCGCC TCCAAAGCTG TTGGTTGGGG AGCTACCATC 1800
ATGGTGGATG CTGCTGATCT TGTGGTCCAA GGCAAAGGGA AGTTCGAGGA 1850
GCTGATGGTG TGTTCACGCG AGATTGCTGC CAGTACTGCC CAGCTCGTGG 1900
CTGCATCCAA GGTGAAAGCG AACAAGGGCA GCCTCAATCT GACCCAGCTG 2000
CAGCAGGCCT CTCGAGGAGT GAACCAGGCC ACAGCCGCTG TGGTGGCCTC 2050
AACCATTTCT GGCAAATCTC AGATTGAGGA AACAGACAGT ATGGACTTCT 2100
CAAGCATGAC ACTGACCCAG ATCAAGCGCC AGGAGATGGA TTCCCAGGTT 2150
AGGGTGCTGG AGCTGGAAAA TGACCTGCAG AAGGAGCGTC AGAAACTAGG 2200
AGAGCTACGG AAGAAACACT ACGAGCTGGA GGGCGTGGCT GAGGGCTGGG 2250
AGGAAGGGAC AGAAGCATCA CCGTCTACTG TCCAAGAAGC AATACCGGAC 2300
AAAGAGTAGA GCCAAGCCGA CACCCCACAC ATCAGAAA 2338



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: no
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 
(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Ala Arg Gly Leu Ile Gln Ile Pro Gln Leu Pro Glu Asn Pro Pro

5 10 15

Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His Ile Ser Pro Val

20 25 30

Val Val Ile Pro Ala Glu Val Ser Ser Pro Asp Ser Glu Pro Val

35 40 45

Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala Ser Gln Gln Thr

50 55 60

Leu Phe Asp Asn Lys Phe Asp Asp Val Phe Gly Ser Ser Leu Ser

65 70 75

Ser Asp Pro Phe Asn Phe Asn Asn Gln Asn Gly Val Asn Lys Asp

80 85 90

Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg Glu Ile Ser Gly

95 100 105

Leu Thr Gly Gln Leu Asp Asn Met Lys Ile Glu Ser Gln Arg Ala

110 115 120

Met Leu Gln Leu Lys Gly Arg Val Ser Glu Leu Glu Ala Glu Leu

125 130 135

Ala Glu Gln Gln His Leu Gly Arg Gln Ala Met Asp Asp Cys Glu

140 145 150

Phe Leu Arg Thr Glu Leu Asp Glu Leu Lys Arg Gln Arg Glu Asp

155 160 165

Thr Glu Lys Ala Gln Arg Ser Leu Thr Glu Ile Glu Arg Lys Ala

170 175 180

Gln Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys Glu Lys Tyr Ser

185 190 195

Glu Leu Val Gln Asn His Ala Asp Leu Leu Arg Lys Asn Ala Glu

200 205 210

Val Thr Lys Gln Val Ser Val Ala Arg Gln Ala Gln Val Asp Leu

215 220 225

Glu Arg Glu Lys Lys Glu Leu Ala Asp Ser Phe Ala Arg Val Ser

230 235 240



Asp Gln Ala Gln Arg Lys Thr Gln Glu Gln Gln Asp Val Leu Glu

245 250 255

Asn Leu Lys His Glu Leu Ala Thr Ser Arg Gln Glu Leu Gln Val

260 265 270

Leu His Ser Asn Leu Glu Thr Ser Ala Gln Ser Glu Ala Lys Trp

275 280 285

Leu Thr Gln Ile Ala Glu Leu Glu Lys Glu Gln Gly Ser Leu Ala

290 295 300

Thr Val Ala Ala Gln Arg Glu Glu Glu Leu Ser Ala Leu Arg Asp

305 310 315

Gln Leu Glu Ser Thr Gln Ile Lys Leu Ala Gly Ala Gln Glu Ser

320 325 330

Met Cys Gln Gln Val Lys Asp Gln Arg Lys Thr Leu Leu Ala Gly

335 340 345

Ile Arg Lys Ala Ala Glu Arg Glu Ile Gln Glu Ala Leu Ser Gln

350 355 360

Leu Glu Glu Pro Thr Leu Ile Ser Cys Ala Gly Ser Thr Asp His

365 370 375

Leu Leu Ser Lys Val Ser Ser Val Ser Ser Cys Leu Glu Gln Leu

380 385 390

Glu Lys Asn Gly Ser Gln Tyr Leu Ala Cys Pro Glu Asp Ile Ser

395 400 405

Glu Leu Leu His Ser Ile Thr Leu Leu Ala His Leu Thr Gly Asp

410 415 420

Thr Val Ile Gln Gly Ser Ala Thr Ser Leu Arg Ala Pro Pro Glu

425 430 435

Pro Ala Asp Ser Leu Thr Glu Ala Cys Arg Gln Tyr Gly Arg Glu

440 445 450 

Thr Leu Ala Tyr Leu Ser Ser Leu Glu Glu Glu Gly Thr Val Glu

455 460 465

Asn Ala Asp Val Thr Ala Leu Arg Asn Cys Leu Ser Arg Val Lys

470 475 480






Thr Leu Gly Glu Glu Leu Leu Pro Arg Gly Leu Asp Ile Lys Gln

485 490 495

Glu Glu Leu Gly Asp Leu Val Asp Lys Glu Met Ala Ala Thr Ser

500 505 510

Ala Ala Ile Glu Ala Ala Thr Thr Arg Ile Glu Glu Ile Leu Ser

515 520 525

Lys Ser Arg Ala Gly Asp Thr Gly Val Lys Leu Glu Val Asn Glu

530 535 540

Arg Ile Leu Gly Ser Cys Thr Ser Leu Met Gln Ala Ile Lys Val

545 550 555

Leu Val Val Ala Ser Lys Asp Leu Gln Lys Glu Ile Val Glu Ser

560 565 570

Gly Arg Gly Ser Ala Ser Pro Lys Glu Phe Tyr Ala Lys Asn Ser

575 580 585

Arg Trp Thr Glu Gly Leu Ile Ser Ala Ser Lys Ala Val Gly Trp

590 595 600

Gly Ala Thr Ile Met Val Asp Ala Ala Asp Leu Val Val Gln Gly

605 610 615

Lys Gly Lys Phe Glu Glu Leu Met Val Cys Ser Arg Glu Ile Ala

620 625 630

Ala Ser Thr Ala Gln Leu Val Ala Ala Ser Lys Val Lys Ala Asn

635 640 645

Lys Gly Ser Leu Asn Leu Thr Gln Leu Gln Gln Ala Ser Arg Gly

650 655 660

Val Asn Gln Ala Thr Ala Ala Val Val Ala Ser Thr Ile Ser Gly

665 670 675

Lys Ser Gln Ile Glu Glu Thr Asp Ser Met Asp Phe Ser Ser Met

680 685 690

Thr Leu Thr Gln Ile Lys Arg Gln Glu Met Asp Ser Gln Val Arg

695 700 705

Val Leu Glu Leu Glu Asn Asp Leu Gln Lys Glu Arg Gln Lys Leu

710 715 720

Gly Glu Leu Arg Lys Lys His Tyr Glu Leu Glu Gly Val Ala Glu

725 730 735

Gly Trp Glu Glu Gly Thr Glu Ala Ser Pro Ser Thr Val Gln Glu

740 745 750

Ala Ile Pro Asp Lys Glu

755



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3964 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 

(ix) FEATURE: cDNA for Huntingtin-interacting protein - mHIP1a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GGCACGAGGC GGCGCGCGGC CTCCGTGTGC CTAGGCTTGA GGCGGGCGGT 50

GACGCCTCAT TCGCGCGGAG CCGGGCCGGG ACACGGTCGG CGGCAGCATG 100

AACAGCATCA AGAATGTGCC GGCGCGGGTG CTGAGCCGCA GGCCGGGCCA 150

CAGCCTAGAG GCCGAGCGCG AGCAGTTCGA CAAGACGCAG GCCATCAGTA 200

TCAGCAAAGC CATCAACAGC CAGGAGGCCC CAGTGAAGGA GAAGCATGCC 250

CGGCGTATCA TCCTGGGCAC GCATCATGAG AAGGGAGCCT TCACCTTCTG 300

GTCCTATGCC ATCGGCCTGC CGCTGTCCAG CAGCTCCATC CTCAGCTGGA 350

AGTTCTGTCA CGTCCTTCAC AAGGTCCTCC GGGACGGACA CCCCAACGTC 400

CTGCATGACT ATCAGCGGTA CCGGAGCAAC ATACGTGAGA TCGGTGACTT 450

GTGGGGCCAC CTTCGTGACC AGTATGGACA CCTGGTGAAT ATCTATACCA 500

AACTGTTGCT GACTAAGATC TCCTTCCACC TTAAGCACCC CCAGTTTCCT 550

GCAGGCCTGG AGGTAACAGA TGAGGTGTTG GAGAAGGCGG CGGGAACTGA 600

TGTCAACAAC ATTTTTCAGC TTACCGTGGA GATGTTTGAC TACATGGACT 650

GTGAACTGAA GCTTTCTGAG TCAGTTTTCC GGCAGCTCAA CACGGCCATC 700

GCAGTGTCCC AGATGTCTTC TGGCCAGTGT CGCCTAGCGC CGCTCATCCA 750

GGTCATTCAG GACTGCAGCC ACCTGTACCA CTACACAGTG AAGCTCATGT 800

TTAAGCTGCA CTCCTGTCTC CCGGCAGACA CCCTGCAAGG CCACAGGGAT 850

CGGTTCCACG AGCAGTTCCA CAGCCTCAAA AACTTCTTCC GCCGGGCTTC 900
AGACATGCTG TACTTCAAGA GGCTCATCCA GATCCCGCGG CTGCCTGAGG 950

GACCCCCCAA TTTCCTGCGG GCTTCAGCCC TGGCTGAGCA CATCAAGCCG 1000

GTGGTGGTGA TTCCCGAGGA GGCCCCAGAG GAAGAGGAGC CTGAGAACCT 1050

AATTGAAATC AGCAGTGCGC CCCCTGCTGG GGAGCCAGTG GTGGTGGCTG 1100

ACCTCTTTGA TCAGACCTTT GGACCCCCCA ATGGCTCCAT GAAGGATGAC 1150

AGGGACCTCC AAATCGAGAA CTTGAAGAGA GAGGTGGAGA CCCTCCGTGC 1200

TGAGCTGGAG AAGATTAAGA TGGAGGCACA GCGGTACATC TCCCAGCTGA 1250

AGGGCCAGGT GAATGGCCTG GAGGCAGAGC TGGAGGAGCA GCGCAAGCAG 1300

AAGCAGAAGG CCCTGGTGGA CAACGAGCAG CTGCGCCACG AGCTGGCCCA 1350

GCTCAAGGCC CTGCAGCTGG AGGGCGCCCG CAACCAGGGC CTTCGAGAGG 1400

AAGCAGAGAG GAAGGCCAGT GCCACGGAGG CACGCTACAG CAAGCTGAAG 1450

GAGAAACACA GCGAACTCAT TAACACGCAC GCCGAGCTGC TCAGGAAGAA 1500
CGCAGACACG GCCAAGCAGC TGACAGTGAC ACAGCAGAGC CAGGAGGAGG 1550
TGGCACGGGT AAAGGAACAG CTGGCCTTCC AGATGGAGCA AGCGAAGCGT 1600
GAGTCTGAGA TGAAGATGGA AGAGCAGAGC GAGCAGTTGG AGAAGCTCAA 1650
GAGGGAGCTG GCGGCCAGGG CAGGAGAGCT GGCCCGTGCG CAGGAGGCCC 1700
TGAGCCGCAC AGAACAGAGT GGGTCAGAGC TGAGCTCACG GCTGGACACA 1750
CTGAACGCGG AGAAGGAAGC CCTGAGTGGA GTCGTTCGGC AGCGTGAGGC 1800
AGAGCTGCTG GCCGCTCAGA GCCTGGTGCG GGAGAAGGAG GAGGCGCTTA 1850
GCCAAGAGCA GCAGCGGAGC TCCCAGGAGA AGGGCGAGCT ACGGGGGCAG 1900
CTGGCAGAAA AGGAGTCTCA GGAGCAGGGG CTTCGGGAGA AGCTGCTGGA 1950
TGAGCAGTTG GCGGTGTTGC GAAGTGCAGC CGCCGAGGCA GAGGCCATCC 2000
TACAGGATGC AGTGAGCAAG CTGGACGACC CCCTGCACCT CCGCTGCACC 2050
AGCTCCCCAG ACTACTTGGT GAGCCGGGCT CAGGCAGCGC TGGACAGCGT 2100
GAGCGGCCTG GAGCAGGGCC ACACCCAGTA CGTGGCTTCC TCCGAAGATG 2150
CTTCTGCCCT GGTGGCAGCG CTGACCCGCT TCTCCCATTT GGCTGCGGAC 2200
ACCATTGTCA ATGGTGCCGC CACCTCCCAC CTGGCCCCCA CCGACCCCGC 2250
CGACCGCCTG ATGGACACAT GCAGGGAGTG TGGAGCCCGG GGTCTGGAGC 2300
TGGTGGGACA GCTGCAAGAC CAGACAGTGC TACGGAGGGC TCAGCCCAGC 2350
CTGATGCGGG CCCCCCTGCA GGGCATTCTG CAGTTGGGCC AGGACTTGAA 2400
GCCTAAGAGC CTGGATGTAC GGCAAGAGGA GCTAGGGGCC ATGGTGGACA 2450
AGGAGATGGC GGCCACCTCG GCAGCCATTG AGGACGCTGT GCGGAGGATC 2500
GAGGACATGA TGAGCCAGGC CCGCCACGAG AGGTCAGGCG TGAAACTGGA 2550
GGTGAATGAG AGGATCCTCA ACTCCTGCAC AGACCTGATG AAGGCTATCC 2600
GGCTCCTGGT GATGACCTCC ACCAGCCTGC AGAAGGAAAT TGTGGAGAGC 2650
GGCAGGGGGG CAGCAACGCA GCAGGAATTT TATGCCAAGA ATTCACGGTG 2700
GACTGAAGGC CTCATCTCAG CCTCTAAGGC AGTGGGCTGG GGAGCCACAC 2750
AGCTGGTGGA GTCAGCTGAC AAGGTTGTGC TTCACATGGG CAAATACGAG 2800
GAACTCATCG TCTGCTCCCA TGAGATTGCG GCCAGCACGG CCCAGCTGGT 2850
GGCAGCCTCG AAGGTGAAAG CCAACAAGAA CAGTCCCCAC TTGAGCCGCC 2900
TGCAGGAATG TTCCCGCACT GTCAACGAGA GGGCTGCCAA CGTCGTGGCC 2950
TCCACCAAAT CTGGCCAGGA GCAGATTGAG GACAGAGACA CCATGGATTT 3000
CTCTGGCCTG TCCCTCATCA AGTTGAAGAA GCAGGAGATG GAGACACAGG 3050
TGCGAGTCTT GGAGCTGGAG AAGACACTAG AGGCAGAGCG TGTCCGGCTC 3100
GGGGAGCTTC GGAAACAGCA CTATGTACTG GCTGGGGGGA TGGGAACACC 3150
TAGCGAAGAA GAACCCAGCA GACCCAGCCC AGCTCCCCGA AGTGGGGCCA 3200
CTAAGAAGCC ACCGCTGGCC CAGAAACCCA GCATAGCCCC CAGGACAGAC 3250
AACCAGCTCGA CAAAAAGGAT GGTGTCTACC CAGCTCAACT TGTGAACTAC 3300
TAGGCCCCTAA GGTGTTCAGC AGGATGGCTG GTGGTTGTGC CTGGGCTTCA 3350
TGTGGCTGTCT GGCAGTGGTC AAGGGGCCTC TGAGAAGCCT CCAACTCCTG 3400
CCCAAGGGGCC TAGTCTGTGG GACAGTTCAT CTGGATGTGA ATCTATTTAT 3450
CTTAAGTAGGA ACTGCCTCGA GCAGCTGGGA CCCAGCAGGC CTGAGCCACA 3500
AATCTGCAGCG GACATCAGAG ATAGTCTGAA TGCTGCGAGG TATTTCTTTC 3550
TTCGTAAGTTT AGTCAGCACA CTGGGAAAAG GTCACATAAG CCAGGAGCCT 3600
CCTTGTCTCTG GACTCAAAAG TCTGAGGCCT TAAGTGAACA ACAGAAAGAG 3650
GGTCCCTGCTG GCTACCAGGG ATAAGGGGAT GACCTGTGAC CCTTGAGCCA 3700
GGGAGAGCAGG TAAGCTGGGT GGTGTCATCA CCTGGGGGCC TGGTGCTAGG 3750
GCATCCATGCT GGGAGCCCCA GGAGACCAGG CTTTGTGTGG GAGCCTGGCA 3800
TCATCGTGGCT GGGGCAGCCC CTGCTCAGGT GCTGTCTCTG CCCGTGACCT 3850
TGAAGCCACCC TCCCCCCGTA CAGTTTTCCA TTCTCCTGGC TACTAGTGTG 3900
GCTGTTCATTG CCTACCTTGA TGAGTAGATT TCAGCCCTCC TAAAGCTGGG 3950
GCCTTTCCTCG TGCC 3964

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 

(ix) FEATURE: Huntingtin-interacting protein - mHIP1a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Asn Ser Ile Lys Asn Val Pro Ala Arg Val Leu Ser Arg Arg

5 10 15

Pro Gly His Ser Leu Glu Ala Glu Arg Glu Gln Phe Asp Lys Thr

20 25 30

Gln Ala Ile Ser Ile Ser Lys Ala Ile Asn Ser Gln Glu Ala Pro

35 40 45

Val Lys Glu Lys His Ala Arg Arg Ile Ile Leu Gly Thr His His

50 55 60

Glu Lys Gly Ala Phe Thr Phe Trp Ser Tyr Ala Ile Gly Leu Pro

65 70 75

Leu Ser Ser Ser Ser Ile Leu Ser Trp Lys Phe Cys His Val Leu

80 85 90

His Lys Val Leu Arg Asp Gly His Pro Asn Val Leu His Asp Tyr

95 100 105

Gln Arg Tyr Arg Ser Asn Ile Arg Glu Ile Gly Asp Leu Trp Gly

110 115 120

His Leu Arg Asp Gln Tyr Gly His Leu Val Asn Ile Tyr Thr Lys

125 130 135

Leu Leu Leu Thr Lys Ile Ser Phe His Leu Lys His Pro Gln Phe

140 145 150

Pro Ala Gly Leu Glu Val Thr Asp Glu Val Leu Glu Lys Ala Ala

155 160 165

Gly Thr Asp Val Asn Asn Ile Phe Gln Leu Thr Val Glu Met Phe

170 175 180

Asp Tyr Met Asp Cys Glu Leu Lys Leu Ser Glu Ser Val Phe Arg

185 190 195

Gln Leu Asn Thr Ala Ile Ala Val Ser Gln Met Ser Ser Gly Gln

200 205 210

Cys Arg Leu Ala Pro Leu Ile Gln Val Ile Gln Asp Cys Ser His

215 220 225

Leu Tyr His Tyr Thr Val Lys Leu Met Phe Lys Leu His Ser Cys

230 235 240

Leu Pro Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe His Glu

245 250 255

Gln Phe His Ser Leu Lys Asn Phe Phe Arg Arg Ala Ser Asp Met

260 265 270

Leu Tyr Phe Lys Arg Leu Ile Gln Ile Pro Arg Leu Pro Glu Gly

275 280 285

Pro Pro Asn Phe Leu Arg Ala Ser Ala Leu Ala Glu His Ile Lys

290 295 300

Pro Val Val Val Ile Pro Glu Glu Ala Pro Glu Glu Glu Glu Pro

305 310 315

Glu Asn Leu Ile Glu Ile Ser Ser Ala Pro Pro Ala Gly Glu Pro

320 325 330

Val Val Val Ala Asp Leu Phe Asp Gln Thr Phe Gly Pro Pro Asn

335 340 345

Gly Ser Met Lys Asp Asp Arg Asp Leu Gln Ile Glu Asn Leu Lys

350 355 360

Arg Glu Val Glu Thr Leu Arg Ala Glu Leu Glu Lys Ile Lys Met

365 370 375

Glu Ala Gln Arg Tyr Ile Ser Gln Leu Lys Gly Gln Val Asn Gly

380 385 390

Leu Glu Ala Glu Leu Glu Glu Gln Arg Lys Gln Lys Gln Lys Ala

395 400 405

Leu Val Asp Asn Glu Gln Leu Arg His Glu Leu Ala Gln Leu Lys

410 415 420

Ala Leu Gln Leu Glu Gly Ala Arg Asn Gln Gly Leu Arg Glu Glu

425 430 435

Ala Glu Arg Lys Ala Ser Ala Thr Glu Ala Arg Tyr Ser Lys Leu

440 445 450



Lys Glu Lys His Ser Glu Leu Ile Asn Thr His Ala Glu Leu Leu

455 460 465

Arg Lys Asn Ala Asp Thr Ala Lys Gln Leu Thr Val Thr Gln Gln

470 475 480

Ser Gln Glu Glu Val Ala Arg Val Lys Glu Gln Leu Ala Phe Gln

485 490 495

Met Glu Gln Ala Lys Arg Glu Ser Glu Met Lys Met Glu Glu Gln

500 505 510

Ser Asp Gln Leu Glu Lys Leu Lys Arg Glu Leu Ala Ala Arg Ala

515 520 525

Gly Glu Leu Ala Arg Ala Gln Glu Ala Leu Ser Arg Thr Glu Gln

530 535 540

Ser Gly Ser Glu Leu Ser Ser Arg Leu Asp Thr Leu Asn Ala Glu

545 550 555

Lys Glu Ala Leu Ser Gly Val Val Arg Gln Arg Glu Ala Glu Leu

560 565 570

Leu Ala Ala Gln Ser Leu Val Arg Glu Lys Glu Glu Ala Leu Ser

575 580 585

Gln Glu Gln Gln Arg Ser Ser Gln Glu Lys Gly Glu Leu Arg Gly

590 595 600

Gln Leu Ala Glu Lys Glu Ser Gln Glu Gln Gly Leu Arg Gln Lys

605 610 615 

Leu Leu Asp Glu Gln Leu Ala Val Leu Arg Ser Ala Ala Ala Glu

620 625 630

Ala Glu Ala Ile Leu Gln Asp Ala Val Ser Lys Leu Asp Asp Pro

635 640 645

Leu His Leu Arg Cys Thr Ser Ser Pro Asp Tyr Leu Val Ser Arg

650 655 660

Ala Gln Ala Ala Leu Asp Ser Val Ser Gly Leu Glu Gln Gly His

665 670 675

Thr Gln Tyr Leu Ala Ser Ser Glu Asp Ala Ser Ala Leu Val Ala

680 685 690

Ala Leu Thr Arg Phe Ser His Leu Ala Ala Asp Thr Ile Val Asn

695 700 705

Gly Ala Ala Thr Ser His Leu Ala Pro Thr Asp Pro Ala Asp Arg

710 715 720

Leu Met Asp Thr Cys Arg Glu Cys Gly Ala Arg Ala Leu Glu Leu

725 730 735

Val Gly Gln Leu Gln Asp Gln Thr Val Leu Arg Arg Ala Gln Pro

740 745 750

Ser Leu Met Arg Ala Pro Leu Gln Gly Ile Leu Gln Leu Gly Gln

755 760 765

Asp Leu Lys Pro Lys Ser Leu Asp Val Arg Gln Glu Glu Leu Gly

770 775 780

Ala Met Val Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu

785 790 795

Asp Ala Val Arg Arg Ile Glu Asp Met Met Ser Gln Ala Arg His

800 805 810

Glu Ser Ser Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Asn

815 820 825

Ser Cys Thr Asp Leu Met Lys Ala Ile Arg Leu Leu Val Met Thr

830 835 840

Ser Thr Ser Leu Gln Lys Glu Ile Val Glu Ser Gly Arg Gly Ala

845 850 855

Ala Thr Gln Gln Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu

860 865 870

Gly Leu Ile Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Gln

875 880 885

Leu Val Glu Ser Ala Asp Lys Val Val Leu His Met Gly Lys Tyr

890 895 900

Glu Glu Leu Ile Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala

905 910 915

Gln Leu Val Ala Ala Ser Lys Val Lys Ala Asn Lys Asn Ser Pro

920 925 930

His Leu Ser Arg Leu Gln Glu Cys Ser Arg Thr Val Asn Glu Arg

935 940 945

Ala Ala Asn Val Val Ala Ser Thr Lys Ser Gly Gln Glu Gln Ile

950 955 960

Glu Asp Arg Asp Thr Met Asp Phe Ser Gly Leu Ser Leu Ile Lys

965 970 975

Leu Lys Lys Gln Glu Met Glu Thr Gln Val Arg Val Leu Glu Leu

980 985 990

Glu Lys Thr Leu Glu Ala Glu Arg Val Arg Leu Gly Glu Leu Arg

995 1100 1105

Lys Gln His Tyr Val Leu Ala Gly Gly Met Gly Thr Pro Ser Glu

1110 1115 1120

Glu Glu Pro Ser Arg Pro Ser Pro Ala Pro Arg Ser Gly Ala Thr

1125 1130 1135

Lys Lys Pro Pro Leu Ala Gln Lys Pro Ser Ile Ala Pro Arg Thr

1140 1145 1150

Asp Asn Gln Leu Asp Lys Lys Asp Gly Val Tyr Pro Ala Gln Leu

1155 1160 1165 



Val Asn Tyr 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi)SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GAAGATACCC CACCAAAC 18 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi)SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GCTTGACAGT GTAGTCATAA AGGTGGCTGC AGTCC 35 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi)SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GGACATGTCC AGGGAGTTGA ATAC 24 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi)SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

CUACUACUAC UACUAGGCCA CGCGTCGACT AGTACGGGII GGGIIGGGII G 41

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 516 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human

(x) FEATURE: exon 1 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TCTGTGGAAG GTTTGGAGGG GAGAGAGGGG CAGCTGGATG CTCTTGGGCC ACGGTCGCCC 60 

CTGATCTCTG CGCCTCTTCC TCCTGCTCCG GGAGAAATAA TGTTTCCCTG GGGGATGAAA 120 

GCATCTCTTT GTGCGGGCTT TAATTGCCAT GTTGTTGTGC CAAGGGAGTG AGTGGCGGCG 180

GGACCAGCAG CTGGGCACAG CCAATGCCAG GCAGTGGTGC CCACTCCCTC AGGACGCCCA 240

GCCAGCTGGC TCCTGGGAGC GCTGCCCACC TCTGCCCCCA GCTGGGCGCC TGCAAGGAAC 300

CGACCACCCG TGGGGCTGGG GGAGGTTGGC TGGAGGAGGA GAAAGGGGCG GGCTCTGGGA 360

GGGTCTCAGC CACTCTCAGA GGCTTATTCA TCTCATCCTC CTTTCCCTCC CCCTTCTTGT 420

TTTTCAGACT GTCAGCATCA ATAAGGCCAT TAATACGCAG GAAGTGGCTG TAAAGGAAAA 480 

ACACGCCAGA AATATCCTTT GGATGTTGCT TGGAAG 516 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 193 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 2 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

TGTTTTCCAT AACCCCCCCT CACCGTGCAT ACTGGGCACC CACCATGAGA AAGGGGCACA 60
GACCTTCTGG TCTGTTGTCA ACCGCCTGCC TCTGTCTAGC AACCCAGTGC TCTGCTGGAA 120
GTTCTGCCAT GTGTTCCACA AACTCCTCCG AGATGGACAC CCGAACGTGA GTTCCTGGGG 180
CTATGGGGTG GCA 193

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 104

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(x) FEATURE: exon 3 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

GTGTTCTTTT GCCCCTGCAG GTCCTGAAGG ACTCTCTGAG ATACAGAAAT GAATTGAGTG 60
ACATGAGCAG GATGTGGGTG AGTTTGGAGA TGTACTCAGG AGCC 104

(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 327

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 4 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

AATTCCTGGC TGCAGATCTC TTGACTGTTA TGTTCTTGTT GTTGACTCTG TTTCCCCTCC 60
TCTTCCTAAA AGGGCCACCT GAGCGAGGGG TATGGCCAGC TGTGCAGCAT CTACCTGAAA 120
CTGCTAAGAA CCAAGATGGA GTACCACACC AAAGTGAGTC TCTGCGGACA GTTCTGCCGC 180
CACCGCCGCC TCCCCTGCTC CATCCCTTCA GCCCCTCCCT GGGCTCATTT GTCAGCTCTT 240
TCAGGTAATA GACAGCCCAG GCTTCTGAGG AAGTGTGCAC ATCATGTACC CAAGCTGTGA 300
GAGAGGAAAG CCACCGCCAG GCCCACG 327



(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 331 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 5 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

GGGCTCAAGC AATCCTCCCA CCTCGGCCTC CCAAGTAGCT GGGACCACAG GCGTGTGCCA 60
CCACGCCCGG CTGAGAGAGG GCTCTTCATG TCTTCTGCCC TGACTCCCTT CCTCTGCCTC 120
CCTTCCAGAA TCCCAGGTTC CCAGGCAACC TGCAGATGAG TGACCGCCAG CTGGACGAGG 180
CTGGAGAAAG TGACGTGAAC AACTTGTAAG TGGCTCCTGC CCTGAGCCCA GGGAGGGAGA 240
AAGCTTTTGT GAATGCTGAC ACTTCTCATA AGGGTCATGG AGGGCCTGAT GGGGGGAGGC 300
CGTGGCTGGG ATGGGGACCA AAGCCCCTGG G 331 



(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(x) FEATURE: exon 6 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

ACTGTCGCTG TCACTGTTGA CTTCACCAGG CTGCATGGCC ATAATACCCA CAAGGCTAAG 60

ACTTGGAGCT GGAGTTGTGT GTGTGTTTGC GCATGCACAT GAGCATTGGA GACTGGAGTA 120

GCGTAGAGCG TGGGGGAGGG GACAGGTAAC AGACCGGCCT CAGGCTGTGG AGTGTAAGCT 180

CTCTTTCCTC TTGGGTCCAG TTTCCAGTTA ACAGTGGAGA TGTTTGACTA CCTGGAGTGT 240

GAACTCAACC TCTTCCAAAC AGGTGAGTCT CTTCCCTCCC GTCTAACCCA GGCTCTCATG 300

GGAACTACCT AATTCCTAGT CCTCCTCTCC CTGCAAAGTG TGCAGCACAA GGGGTAGGAA 360

AATGGAGACA TTCACACCCC ATCTCTGGTC TCTCCAACCC TCGTGCAGGG AGGGACTGAA 420

CCTCTTCAGT ATTTTTCTTT TTAAGAGACA AGGTCTCGGC CGGGTGCAGT 470



(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 565 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 7 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

TCTTCACCTG TTTAATGGGG ATACGTTTAC CTATCTCATG GGAGTGTTGT GAAGGTTAAA 60

TGAATTAGAT GAGGTAAAGC ACGCACAGAA TCGGTCCTTG GTGTATGTTG GACCCCTGCC 120

TCTGCCCCTC TGAAGAGGCT GCCTGTAATC CCCTGGCTCT ACCACCTTTC TCCCTCACTT 180

TTATTTCCTA GTATTCAACT CCCTGGACAT GTCCCGCTCT GTGTCCGTGA CGGCAGCAGG 240

GCAGTGCCGC CTCGCCCCGC TGATCCAGGT CATCTTGGAC TGCAGCCACC TTTATGACTA 300

CACTGTCAAG CTTCTCTTCA AACTCCACTC CTGTGAGTAC CGCGGGCCAG ATCTTCTTAC 360

ATGAGATTCA GGCCAGAGGG AGGATCCCAG CCTGAGGATG TCCCCAGAGA AACGCAGTCC 420

TTCTCAGTGC CTTTGGCTGT CTGCTTCTGT TCCAAAAGGC CCCGGAGCTT CTGACCATTG 480

TGAGGATAAA AGAGCAGGGC CCAGGCTTTG GTGACCCCAG TAAAGCCCCT GGCTTGCCAC 540

TCTTGCGTCC AGTGTTACAG GATCT 565



(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 8 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

GGGACAGCTC TAGGCCAGTC GTGGCCCCTG GCAGTGCTGG CCACATGCCC CAGGGTAGCT 60

GGGCCCCTCC CCCTCGAGAG CCCCGCTGTG GCTTCCCTGC CCTCTGGTCC CCCTCCCCTC 120
TCACACTCTT TCCAATTTCT TCCAGGCCTC CCAGCTGACA CCCTGCAAGG CCACCGGGAC 180
CGCTTCATGG AGCAGTTTAC AAAGTAAGTG GTTCAAGTAA CAGGAATGGA GGT 233



(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 578 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exons 9 and 10 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TGAATCCCAG CACCATGGAG TTTATCTCCT TGACAGCCTG TGCCTTTGGG CTGGGGAGGG 60

GGCAGGAAAG CCAGGTGGCT GCTCTGTCCC CTACATGGGG CTGATGAAGA CACCCAGCAC 120
CCCTCAGGTC CTTCTCCACC CCTAGGTTGA AAGATCTGTT CTACCGCTCC AGCAACCTGC 180
AGTACTTCAA GCGGCTCATT CAGATCCCCC AGCTGCCTGA GGTAAGCATG CCCAACCACA 240
CACCCTCGGC ACTGCAGAGG CCCCAGGTAC TCTCTTAAGG GCCGGCGGGG CCTGGCAAGC 300
AAGCACTATT TGAGGATGTG TCTCCGTCTT CAGAACCCAC CCAACTTCCT GCGAGCCTCA 360
GCCCTGTCAG AACATATCAG CCCTGTGGTG GTGATCCCTG CAGAGGCCTC ATCCCCCGAC 420
AGCGAGCCAG TCCTAGAGAA GGATGACCTC ATGGACATGG ATGCCTCTCA GCAGGTGAGG 480
ACCACTTGGG AGAGAAACTT GGCCTTTCCT CTCACCTGCA AGTACAGGGG AGAGGCTGGG 540
GGAGACCCTG GCCAAAGCCC ATTGACTCTA ACCAGGTT 578



(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 390 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 11 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

AAAAAAATTT AAAAAATTAA ACAGGTCTGA ACCGTTTAAT TCGAGAAAGG GGGCATTCTC 60 
CCATATCACT CAACTGACCC ACACACAGAA TTCTCTGGCT CTCTGACTTA TTCTCACTCC 120 
TTTTTGGTCA ACCACAGAAT TTATTTGACA ACAAGTTTGA TGACATCTTT GGCAGTTCAT 180 
TCAGCAGTGA TCCCTTCAAT TTCAACAGTC AAAATGGTGT GAACAAGGAT GAGAAGTGAG 240 
TCCAAGCTGG GTTCAAGCAG ATGGTTCAGG AGCTAAGTTA AGCCATGGTC TGCCTCAAAA 300 
CACTAACCAA AGAGGAATTC TTAATGATAC TGGGGCTTCT TAGATACAGA ACATCTTGAA 360
GGGTTGGGGG CAATGGCTTA TGCCTGTAAT 390

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 547 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 12 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

AAAATCAATA ACCATGGATT TATGAGTATT AGATTAGTAT CTGGTAACAT TTAGAGTATA 60

ATTTATGGCA TTTCAAAGAA TTGTCCCCAA ATTAATACCA GCTTTTAATT TCCTCCCCTG 120

AGCTCACAAT TAAAAACAGA GGGATAGAAG CACTATGAAA GCAAACTCAT TCCCCTTCTC 180

TTCCCAGGGA CCACTTAATT GAGCGACTAT ACAGAGAGAT CAGTGGATTG AAGGCACAGC 240

TAGAAAACAT GAAGACTGAG GTATAACTTG GATCTGCTCT GCCTTTGCGC TTCACCAAAA 300

CACGGTAGAT TTGAATGTTA AATTTGCATC ACACTAGCCA GGCACAGTGG CTCACACCTG 360

TAATCCTAGC ACTTTGGGAG GCCAAGGCAG GAGGATTACC TGAGGTCGGG AGTTCGAGAC 420

CAGCCTGGGC AACAGGGTGA AACCCCCGTC TTCAATAAAA ATGCAATAAT TAGCCGGGTG 480

TGTTGGCAGG CACCTGTAAT CCCAGCTACT CGGGAAGCTG AGGCATGAGA ATTGCTTGAA 540

CTTGGGA 547



(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 436

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 13 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

CCCCCAGCCA CTCTAAAGAG GACCACAATT CCCCGGCCAT CATCCCCTGT TATTGTTGTT 60
GATTGAGGGG CTCCTAATGA CCAGATGGTC CAACCCTCCT GGGACGTGGA GAGTTGACTT 120
AGGGGAATCA GGTATTTACT TGGAAGCATG GTAGGACCCG CTTCTCCGGC CCATGCCCGT 180
GACCCGTGGC AGTGGGCGGT TGGCCTCATG ACCGGAGTCC CCCCACAGAG CCAGCGGGTT 240
GTGCTGCAGC TGAAGGGCCA CGTCAGCGAG CTGGAAGCAG ATCTGGCCGA GCAGCAGCAC 300
CTGCGGCAGC AGGCGGCCGA CGACTGTGAA TTCCTGCGGG CAGAACTGGA CGAGCTCAGG 360
AGGCAGCGGG AGGACACCGA GAAGGCTCAG CGGAGCCTGT CTGAGATAGA AAGTGAGCGG 420
TGGGTGGGGG CGGGGG 436

(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 469

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 14 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GACTTGAGCC CAAGGAGGTC AAGGCTGCAG TGAACAGTGA TTGTGCCACT GCACCCCAGC 60

CTGGGTGACA GAGCAAGACT GTCTCAAAAC AAAACAAGGA GGACCTTCTA GGGACCCTGG 120

CTCATTGCAA GGAAGGCAAG GGTCCCTGCT AGGTTAGACT CCTCACCTTG GTCCTTTACA 180

ATACAGGGAA AGCTCAAGCC AATGAACAGC GATATAGCAA GCTAAAGGAG AAGTACAGCG 240

AGCTGGTTCA GAACCACGCT GACCTGCTGC GGAAGGTAAG ACCCTCAGCC CCTGTCACCA 300

TCCTGCAGGC CCTGCACCTC TAGGGAGAGA GCGGCTCAGG CCTGTGGCTT CCCCGGGGCC 360

AGCAACCCCT ACATTGATCT CTAAGGCATT GCCGTCATCT CGGGAACCAC ACCTTTTCAG 420

GCTTCCTTGC CTCTGTGTCT TGGGCTGTGT CCTGGGTGCC AATCCCATG 469

(2) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 359 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 15 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

GGGTAGGAAA GTGATTCCTG TGTCTGACTC TAGGGCACGC ACAGCCTGAG TATGATTGTC 60
CTAGAAGGAG GATGTCCTCT AAGCCTGGGA TCTCCTGGTT CAAGACACTG TTCTTCTTTT 120

GCAGAATGCA GAGGTGACCA AACAGGTGTC CATGGCCAGA CAAGCCCAGG TAGATTTGGA 180

ACGAGAGAAA AAAGAGCTGG AGGATTCGTT GGAGCGCATC AGTGACCAGG GCCAGCGGAA 240

GGTGAGTGGG ACGAGGAGCA CTCGGGAAAT GAGGGAGGGG GCTGTTGAGT TGGTGGCGGG 300
GGCTTTGTGG CCTTCTGCTC CATGGGCAGT TCTGTGGGTC GGTTGGCATC ACACAGCAG 359

(2) INFORMATION FOR SEQ ID NO:31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 209 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 16 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 



GTTGATCGCT TGGGACGTTT TTACATTTTT ATATTCTTTG TCACTGTCAC CCAGATCAGA 60 

GTCCCTCTGT TTTTCTTCTC TTTCAGACTC AAGAACAGCT GGAAGTTCTA GAGAGCTTGA 120

AGCAGGAACT TGCCACAAGC CAACGGGAGC TTCAGGTTCT GCAAGGCAGC CTGGAAACTT 180 

CTGCCCAGGT AAATACCTCC TTTTTTTTT 209 



(2) INFORMATION FOR SEQ ID NO:32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 17 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 



CCCCCACTGC AATCAGTGTG TCCCCGGGAG GGAATCAGAG TGGCAGGTTA AAGAGCCATC 60

ACCTTCCCAG TCCTTGCAAC CCGGTGGTGG GTTGGACCTC TGGGAAGTAG GGACTGTTTA 120

ACTCAACCAG CGTCTCCCTC TTTCCTTGTG GTCACCTTTG CAGTCAGAAG CAAACTGGGC 180

AGCCGAGTTC GCCGAGCTAG AGAAGGAGCG GGACAGCCTG GTGAGTGGCG CAGCTCATAG 240 

GGAGGAGGAA TTATCTGCTC TTCGGAAAGA ACTGCAGGAC ACTCAGCTCA AACTGGCCAG 300 

CACAGAGGGT CACGGACATG GACACGAGCG AGCACCTGTG AATTCCCACC GAGGGCCTCT 360 

GCGCATGCAC GGAGGCTGGG AGGACCCCGG GGCTGCTGAG AAGGGGTTTG GGGCCTTGGC 420 

CTGATTGTGC AGACATTCTG TAGGTGTAAT GCCAGCAGGC CCTGCATTGC CTGCAGAGTC 480 

CATGA 485 



(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 18 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

TTACTGGCTT GGACCTCATT GGCCATGACT TGAGCTAAGA TGCTAAGAGC CCCAGCCAGG 60

TCATCCTGCT CAGGTTCATT ATGGAGTCTA GGGCAGACTC TCACCTCCCT GGACCATTTT 120 
TAGAATCTAT GTGCCAGCTT GCCAAAGACC AACGAAAAAT GCTTCTGGTG GGGTCCAGGA 180

AGGCTGCGGA GCAGGTGATA CAAGACGCCC TGAACCAGCT TGAAGAACCT CCTCTCATCA 240

GCTGCGCTGG GTCTGCAGGT ACACTTGCAA TTGCCCAGCT GGCAGGGGCC AGGTCCTTAC 300

AGCCTGAGAC TCTGTTGATG TTGAATCTCA TGTGAGACTT AGCTCAGGGG CTCTCAGCCC 360

AGCAGCATGT CAGCATTACC TTAGGGGCGC CCAGGCCCCA TCCTAGATCA GTTACATGTG 420 

GAAACTCTGT GCATTAGTGC CTATACACTA GTATTTTAGT ATTTTCTT 468 



(2) INFORMATION FOR SEQ ID NO:34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 393 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 19 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 



CACTAGTAAG CTCCTCCATT CAGTGCTTAA TTAACGAGGA TGAAGCCAGC TATGAGAACT 60 

TGCTCTGACC TTGCCCTGTG TTCCCTCTCA CAGATCACCT CCTCTCCACG GTCACATCCA 120 

TTTCCAGCTG CATCGAGCAA CTGGAGAAAA GCTGGAGCCA GTATCTGGCC TGCCCAGAAG 180

GTAAGAATGG CCAAGGACAG TCTCTGTCGG CTAGTGATGG CCAGACAGGG TTCAGAAGCA 240

CCTGAATGCG GGGATAGTGA CAGGTCCCTC TGCATCAAGA AAGGCATGTA GGCAACTCAT 300

ACAAGAAAGG CATGTAGGCA ACTCATAAAA CGGGAGGAGA GGGTATGAAA GTGTCACCAT 360

CAACCAGACC TGAGAAACTT CTCTTTCCAA TCC 393 



(2) INFORMATION FOR SEQ ID NO:35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 421 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 20 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 



GGCCTGCCCA GAAGGTAAGA ATGGCCAAGG ACAGTCTCTG TCGGCTAGTG ATGGCCAGAC 60 

AGGGTTCAGA AGCACCTGAA TGCGGGGATA GTGACAGGTC CCTCTGCATC AAGAAAGGCA 120 

TGTAGGCAAC TCATACAAGA AAGGCATGTA GGCAACTCAT AAAACGGGAG GAGAGGGTAT 180 

GAAAGTGTCA CCATCAACCA GACCTGAGAA ACTTCTCTTT CCAATCCTGG CAGACATCAG 240 

TGGACTTCTC CATTCCATAA CCCTGCTGGC CCACTTGACC AGCGACGCCA TTGCTCATGG 300

TGCCACCACC TGCCTCAGAG CCCCACCTGA GCCTGCCGAC TGTGAGTACT GGGGCATGAG 360

GGGCTGTTCA TGGACCAGGG GAGCAGGGGG CCTTTAAAAG TCTCTGTTGG GCCGGGCGCA 420

G 421 
(2) INFORMATION FOR SEQ ID NO:36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 498 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 21 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

AGGCCGAGGC AGGAGAATCG CTTGAACTCA GGAGGCGGAG TTTGCAGTGA GCCGAGATGG 60 

CGCCACTGCA CTCCAGCCTG GGCAACAAGA GCGAGACTCC ATCTCAAAAA AAAAGTGTCT 120 

ATTGCCTTGT ATCTCCAGCA CTGACCGAGG CCTGTAAGCA GTATGGCAGG GAAACCCTCG 180 

CCTACCTGGC CTCCCTGGAG GAAGAGGGAA GCCTTGAGAA TGCCGACAGC ACAGCCATGA 240 

GGAACTGCCT GAGCAAGATC AAGGCCATCG GCGAGGTACT TGGAGTAGTA TCATTGAGGA 300

GCATTGTTAT TCTTCTGGGT GTGCGTGCTG GTGAATGGCC AGGGAATCGG TGATGTTCTG 360

AGCTAGTTCT TTCTGCACTT AGAACTTGAT TCTAGAAAGA GATTGTTAAA ATTGGAAAAT 420

CTGGCCGGGT GCAGTGATTT ATGCGTGTAA TCCCAGCACT TTGGGAGGCC GAGTCAGGAG 480 

GATCACTTGA GGCTAGAC 498 



(2) INFORMATION FOR SEQ ID NO:37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 427 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 22 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

CCCTGTGGCT TGCAGAAGGT GTTTGCTGGG TGGCCTCCTG CCTTGCCATC TTGTAAGGGT 60 
TACAGATGGC AGAGGAGAAG AGACAGGAGG CCCCAAGGTC AGTTCAGCCT TTGTGATGTG 120 
TTCACAGGAG CTCCTGCCCA GGGGACTGGA CATCAAGCAG GAGGAGCTGG GGGACCTGGT 180 
GGACAAGGAG ATGGCGGCCA CTTCAGCTGC TATTGAAACT GCCACGGCCA GAATAGAGGT 240 
AGGAGGTTCC TGCAGGATCT CCTGAAACGA TGCCTTTGCA GCTGCCCTTC TGCAACACTG 300

CTCATTAAAC ATGTCACAGT CGTTCATTAA GGCCATGGCA ACCCCCTAAG ACAGAAACCA 360

GAATTTGCCA GGCACAGTGG CTCATGCCTG TAACCCCAGC ACCTTGGGAG GATCACTTGA 420 
GTCCAGG 427 



(2) INFORMATION FOR SEQ ID NO:38: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 367 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 23 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

CCCCCTGAAT AGGTTAGAGT CTGGATTCTT TTCTGACTCT CTCAAGAATG TGGGCAGGGA 60 
CTTGGGGACT TCGAGATTCA GGTTTCCCAG CTACCACACG ATGTTGGACT GAAAGTATAG 120

TAAGACATTA GTGGATCCTT AATATTCAAG GCACATTTAG AAACCATGCT TCTTTTTCAC 180

AGGAGATGCT CAGCAAATCC CGAGCAGGAG ACACAGGAGT CAAATTGGAG GTGAATGAAA 240

GGTCGGTCTG AGCGGCATGG TGGGACCTAG GGGAGCAGGA TCTGTCTTCC TGACATTGGT 300

CTATACTTTG CATACTTATT AGGGAATTAG AGGAGAGCAG TAGCAGCCAC GGGGAAGGGC 360
TGAGTTG 367

(2) INFORMATION FOR SEQ ID NO:39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 502 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 24 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

CCCCGCAGAA TGTTCCAGCA ACCTCAGCAC CCTTCTTACC TCCCTTTCCC ATTCCAAGCT 60

TGCCTTTGGC TAGGAGTGGG GAAGAGAACC GTCGTGTTCA TTGATCTTGG ATCTTGATCT 120

CAGTGTATCC TCGACTTGTT TGTTTGGCAG GATCCTTGGT TGCTGTACCA GCCTCATGCA 180

AGCTATTCAG GTGCTCATCG TGGCCTCTAA GGACCTCCAG AGAGAGATTG TGGAGAGCGG 240

CAGGGTGAGC GTGGGTGTGG GCCCTGGGCA GGAAGAGGAG GCATCGGTGA CAGACTCCCG 300

CTCCAACGGA CTCTGTGATG CTGCCGTCTT ACTCTGTGTG TCCACCTGAG TACAGAGCAG 360

CCACTCCTGT AGATATCAGC AGAGGCCCTG GGGAGAAGTC AGAGCTCCAG GACCTCCCCA 420

GAGGGTGGCC AGGCATGTGT CCCAACTCCA GCTCCCTTCG CACAGGCAGA CATTGTTGGA 480 

ACTTGCTGTG GGAGCCCTTT TT 502 



(2) INFORMATION FOR SEQ ID NO:40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 437 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human 

(x) FEATURE: exon 25 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

TTTTGGTCTC TGAATCTTCT TCTTTTTTGT AAAATGGGAA TACTAATGCT TATGTCTCAG 60

AGTTACTATG AGGATGATTT GGGATAATAT ATGTATAAAA GCACCTGCCA TATAGTACAT 120

GCTCAATAAA AGGTGGCTAT TACTATTTTT TATTTCCCTA GGGTACAGCA TCCCCTAAAG 180

AGTTTTATGC CAAGAACTCT CGATGGACAG AAGGACTTAT CTCAGCCTCC AAGGCTGTGG 240

GCTGGGGAGC CACTGTCATG GTGTAAGTAT CTATTGGTAC CAAGGGTCCT CCCATGACCC 300

CTCTTCCATT GATCCACTCC AAACAATAGC TAAGGAGGGA AAAAAAAATC TGTCCCTTAG 360

AAATAAACTA TTGATCAGGA AGTCAATAGG ACCGAGTTTA CAAGGGAGCC TGGCTCTCCC 420

AGGGGACACA GGGCAGG 437 



(2) INFORMATION FOR SEQ ID NO:41: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 351 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 26 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

GGGAGCCTGG CTCTCCCAGG GGACACAGGG CAGGCAGCCT CCCCTCCCTG TTTAGCCAAG 60 
GGCGATGGGG TGGTCTGGAG GTGGGATTGT GGAGGAGTTG CAGCTCATTT GCCCGTAACC 120

TAGTCCCTCT TGTCGTTTTC CATCAGGGAT GCAGCTGATC TGGTGGTACA AGGCAGAGGG 180
AAATTTGAGG AGCTAATGGT GTGTTCTCAT GAAATTGCTG CTAGCACAGC CCAGCTTGTG 240
GCTGCATCCA AGGTAGGACC TGGCTGGACC TCCTAGGACG CTGGAAGGCC TGGTTAGAGA 300

GTACTAGGCT AGGTTAAAGA GTACTTGGCT GCGTTAGGCA GTACTTGGCT G 351

(2) INFORMATION FOR SEQ ID NO:42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 418 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 27 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

CTTTTTATAT GATAGATATG TCAGGAGCTG ACTATAGTCA GCAGATTTTG AGAAGCTGAT 60

TGGTGATTGC CGTTTGGCCC ACATATGTTT GCTAAGAACC ATCAGAGCAA TTATCTGATT 120

CAGTCCTTGT TGCTCTAGGT GTTGTATGAA CCTAAATCTG CTTTGTCCTG GTAGGTGAAA 180






GCTGATAAGG ACAGCCCCAA CCTAGCCCAG CTGCAGCAGG CCTCTCGGGG AGTGAACCAG 240

GCCACTGCCG GCGTTGTGGC CTCAACCATT TCCGGCAAAT CACAGATCGA AGAGACAGGT 300

AGCCTTTCCA AAGGGACCCT TTTCTTACCC ACCCTGTTGA GCTCTTCTCT GCATCCTTCC 360

CTGTGATCCC AACCAAATCC CACAGGACTG TGTCTAAATT CTTTCATATT TTTCATCT 418 



(2) INFORMATION FOR SEQ ID NO:43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 279 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 28 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:



TTTCCACAGA GCATTGGCAT TGGCTGCCTC TCAGGTGCCA GTCAGCCAGG GTAGAATTTG 60 

ATGAGACCTT CTTGTTTCCA TCCTTGCAGA CAACATGGAC TTCTCAAGCA TGACGCTGAC 120 

ACAGATCAAA CGCCAAGAGA TGGATTCTCA GGTTAGGGTG CTAGAGCTAG AAAATGAATT 180

GCAGAAGGAG CGTCAAAAAC TGGGAGAGCT TCGGAAAAAG CACTACGAGC TTGCTGGTGT 240 

TGCTGAGGGC TGGGAAGAAG GTAAGCTGAC TCAAAGGAT 279 



(2) INFORMATION FOR SEQ ID NO:44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3715 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 29 and partial cds of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 



AACATAAATT ATCATTGTCT TTTAGGAACA GAGGCATCTC CACCTACACT GCAAGAAGTG 60

GTAACCGAAA AAGAATAGAG CCAAACCAAC ACCCCATATG TCAGTGTAAA TCCTTGTTAC 120 

CTATCTCGTG TGTGTTATTT CCCCAGCCAC AGGCCAAATC CTTGGAGTCC CAGGGGCAGC 180 

CACACCACTG CCATTACCCA GTGCCGAGGA CATGCATGAC ACTTCCCAAA GACTCCCTCC 240 

ATAGCGACAC CCTTTCTGTT TGGACCCATG GTCATCTCTG TTCTTTTCCC GCCTCCCTAG 300 

TTAGCATCCA GGCTGGCCAG TGCTGCCCAT GAGCAAGCCT AGGTACGAAG AGGGGTGGTG 360

GGGGGCAGGG CCACTCAACA GAGAGGACCA ACATCCAGTC CTGCTGACTA TTTGACCCCC 420 

ACAACAATGG GTATCCTTAA TAGAGGAGCT GCTTGTTGTT TGTTGACAGC TTGGAAAGGG 480 

AAGATCTTAT GCCTTTTCTT TTCTGTTTTC TTCTCAGTCT TTTCAGTTTC ATCATTTGCA 540 

CAAACTTGTG AGCATCAGAG GGCTGATGGA TTCCAAACCA GGACACTACC CTGAGATCTG 600 

CACAGTCAGA AGGACGGCAG GAGTGTCCTG GCTGTGAATG CCAAAGCCAT TCTCCCCCTC 660 

TTTGGGCAGT GCCATGGATT TCCACTGCTT CTTATGGTGG TTGGTTGGGT TTTTTGGTTT 720 

TGTTTTTTTT TTTTAAGTTT CACTCACATA GCCAACTCTC CCAAAGGGCA CACCCCTGGG 780 

GCTGAGTCTC CAGGGCCCCC CAACTGTGGT AGCTCCAGCG ATGGTGCTGC CCAGGCCTCT 840 
QGGTGCTCCA TCTCCGCCTC CACACTGACC AAGTGCTGGC CCACCCAGTC CATGCTCCAG 900

GGTCAGGCGG AGCTGCTGAG TGACAGCTTT CCTGAAAAAG CAGAAGGAGA GTGAGTGCCT 960

TTCCCTCCTA AAGCTGAATC CCGGCGGAAA GCCTCTGTCC GCCTTTACAA GGGAGAAGAC 1020

AACAGAAAGA GGGACAAGAG GGTTCACACA GCCCAGTTCC CGTGACGAGG CTCAAAAACT 1080 

TGATCACATG CTTGAATGGA GCTGGTGAGA TCAACAACAC TACTTCCCTG CCGGAATGAA 1140

CTGTCCGTGA ATGGTCTCTG TCAAGCGGGC CGTCTCCCTT GGCCCAGAGA CGGAGTGTGG 1200

GAGTGATTCC CAACTCCTTT CTGCAGACGT CTGCCTTGGC ATCCTCTTGA ATAGGAAGAT 1260

CGTTCCACTT TCTACGCAAT TGACAAACCC GGAAGATCAG ATGCAATTGC TCCCATCAGG 1320

GAAGAACCCT ATACTTGGTT TGCTACCCTT AGTATTTATT ACTAACCTCC CTTAAGCAGC 1380

AACAGCCTAC AAAGAGATGC TTGGAGCAAT CAGAACTTCA GGTGTGACTC TAGCAAAGCT 1440

CATCTTTCTG CCCGGCTACA TCAGCCTTCA AGAATCAGAA GAAAGCCAAG GTGCTGGACT 1500

GTTACTGACT TGGATCCCAA AGCAAGGAGA TCATTTGGAG CTCTTGGGTC AGAGAAAATG 1560

AGAAAGGACA GAGCCAGCGG CTCCAACTCC TTTCAGCGAC ATGCCCCAGG CTCTCGCTGC 1620

CCTGTGGACA GGATGAGGAC AGAGGGCACA TGAACAGCTT GCCAGGGATG GGCAGCCCAA 1680

CAGCACTTTT CCTCTTCTAQ ATGGACCCCA GCATTTAAGT GACCTTCTGA TCTTGGGAAA 1740

ACAGCGTCTT CCTTCTTTAT QTATAGCAAC TCATTGGTGG TAGCCATCAA GCACTTCCCA 1800

GGATCTGCTC CAACAGAATA T^CTAGGTT TTGCTACATG ACGGGTTGTG AGACTTCTGT 1860

TTGATCACTG TGAACCAACC CCC^TCTCCC TAGCCCACCC CCCTCCCCAA CTCCCTCTCT 1920

GTGCATTTTC TAAGTGGGAC ATTCAAAAAA CTCTCTCCCA GGACCTCGGA TGACCATACT 1980

CAGACGTGTG ACCTCCATAC TGGGTTAAGG AAGTATCAGC ACTAGAAATT GGGCAGTCTT 2040

AATGTTGAAT GCTGCTTTCT GCTTAGTATT TTTTTGATTC AAGGCTCAGA AGGAATGGTG 2100

CGTGGCTTCC CTGTCCCAGT TGTGGCAAc\ AAACCAATCG GTGTGTTCTT GATGCGGGTC 2160 

AACATTTCCA AAAGTGGCTA GTCCTCACTtVtAGATCTCA GCCATTCTAA CTCATATGTT 2220

CCCAATTACC AAGGGGTGGC CGGGCACAGT GGCTCACGCC TGTAATCCCA GCACTTTGAG 2280

AGGCTGAGGT GGTAGGATCA CCTGAGGTCA GGAGTTCAAG ACCAGCCTGT CCAACATGGT 2340 

GAAACCCCCA TCTCTACTAA AAATACCAAA AATTAGCCGA GCGTAGTGAC GGGTGCCCGT 2400

AATCCCAGCT ACTCAGGAGG CTGAGACAGG AGAAT^ACCT GAACCCCAGA GGCAGAGGTT 2460

GCAGTGAGCT GAGATCACGC CATTGTACTC CAGCCT^GC AACAAGAGCA AAACTCCGTC 2520

TCAAAAAAAA AAAAAAATTA CAAATGGGGC AAACAGTOTA GTGTAATGGA TCAAATTAAG 2580

ATTCTCTGCC CAGCCGGGCA CAGTGGCGCA TGCCTGTAAT CCCAGAACTT TGGGAGGCCA 2640

AGACGGGATG ATTGCTTGAG CTCAGGAGTT TGAGACCAGG CTGGGCATCA TAGCAAGACC 2700

TCATCTCTAC TAAAATTCAA AAACAAAATT AGCCGGGCAT GATGGTGCAT GCCTGTAGTC 2760

TCAGCTAGTT GGGGAGCTAA GGTGGGAGAA TTGCTTGAGC TTGGGAAGTC GAGGCTGCAG 2820

TCAGCCCTGA TTGTGCCAGT GCACTCCGGC CTGGGTGACA GAGTGAGACC CGTGCTCAAA 2880

AAAAAAAAGA TTCTGTGTCA GAGCCCAGCC CAGGAGTTTG AGGOTGCAAT GAGCCATGAT 2940

TTCCCACTGC ACTCCAGCCT GAGTGACAGA GCGAGACTCC ATCTCTTTAA AAACAAACAA 3000

AAAATTATCT GAATGATCCT GTCTCTAAAA AGAAGCCACA GAAATCTTTA AAAACTTCAT 3060

CGACTTAGCC TGAGTCATAA CGGTTAAGAA AGCACTTAAA CAGAAGc\gA GGCTAATTCA 3120

GTGTCACATG AGGAAGTAGC TGTCAGATGT CACATAATTA CTTTCGTA^? AGCTCAGATT 3180

AGAATGGCTA CCCCATTCTC TAGACAAAAT CAAATTGTCC TATTGTGAC'k CTTCTAAAAA 3240

TGAAGATGAA GAGCTATTTA ATGACACACC TTGGATTAAA ACGGGAATCa\aTCTTAAAG 3300

CTAAAAATGA ACCTGCAAGC CTTCTAAATG AGTCACTGAG CATCACTAGT GACAAGTCTC 3360

GGGTGAGCGT AAATGGGTCA TGACAAGATG GGACAGCAAC AAAATCATGG CTmGGATCG 3420

ACAAGAAGTT AAAAAACAGC TGCATCTGTT ACTTAAGTTT GTAAGACAGT GCCCTGAGAC 3480

CTCTAGAGAA AAGATGTTTG TTTACATAAG AGAAAGAAGG CCAGACATGG TGTCTC\CAC 3540

GTTTAATCCC AGCACTTTGG GAGGCAGGGG CGGGTGGATC ACCTGAGGTC AGGAGTTC^A 3600
GACTAGCCTG GCCAACATGG TGAAACCCCG TCTCTACTAA AAATACAAAA ATTAGCCGGG 3660

CATGGTGGCA GGCGCCTATA ATCCCAGCTA CTGGGGAGGC TGAGGCAGGA GAATC 3715